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Abstract. In this paper, we consider Langcvin processes with mechanical 
constraints. The latter are a fundamental tool in molecular dynamics simula- 
tion for sampling purposes and for the computation of free energy differences. 
The results of this paper can be divided into three parts, (i) We propose a 
simple discretization of the constrained Langevin process based on a splitting 
strategy. We show how to correct the scheme so that it samples exactly the 
canonical measure restricted on a submanifold, using a Metropolis-Hastings 
correction in the spirit of the Generalized Hybrid Monte Carlo (GHMC) algo- 
rithm. Moreover, we obtain, in some limiting regime, a consistent discretiza- 
tion of the overdamped Langevin (Brownian) dynamics on a submanifold, 
also sampling exactly the correct canonical measure with constraints, (ii) 
For free energy computation using thermodynamic integration, we rigorously 
prove that the longtime average of the Lagrange multipliers of the constrained 
Langevin dynamics yields the gradient of a rigid version of the free energy 
associated with the constraints. A second order time discretization using the 
Lagrange multipliers is proposed, (iii) The Jarzynski-Crooks fluctuation re- 
lation is proved for Langevin processes with mechanical constraints evolving 
in time. An original numerical discretization without time discretization error 
is proposed, and its overdamped limit is studied. Numerical illustrations are 
provided for (ii) and (iii). 



1. Introduction and main results 

Free energy is a central concept in thermodynamics and in modern works on bio- 
chemical and physical systems. Typical examples studied by computer simulations 
include the solvation free energies (which is the free energy difference between a 
molecule in vacuo and its counterpart surrounded by solvent molecules) and the 
binding free energy of two molecules (which determines whether a new drug can 
have an efficient action on a given protein). In many applications, it is actually 
the free energy difference profile between the initial and the final state which is a 
quantity of paramount importance. It is observed by practitioners that free energy 
barriers arc a very important element to describe transition kinetics from one state 
to the other. For instance, the chemical kinetics of reactions happening in solvent 
(such as in the cells of our bodies) are limited by free energy barriers, and can 
take place only when the free energy difference between the initial and the final 
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state is negative, or at least less than the typical thermal energy. It is therefore 
very important to accurately compute free energy differences in order to assess the 
likelihood of a certain physical event to happen. 

Beside these physical motivations to compute free energy differences, a more 
abstract motivation is to overcome sampling barriers encountered when computing 
canonical averages (see the discussion in [35l Section 1.3.3]). Indeed, it is often 
the case in practice that the trajectories generated by the numerical methods at 
hand remain stuck for a long time in some region of the phase space, and hop only 
occasionally to another region, where they also remain stuck - a behavior known as 
metastability. Chemical and physical intuitions may guide the practitioners of the 
field towards the choice of some slowly evolving degree of freedom, called reaction 
coordinate in the following, responsible for the metastable behavior of the system. 
In this case, free energy techniques can be used to accelerate the sampling. This 
viewpoint allows to consider applications which are not motivated by physical or 
biological problems, such as curing sampling issues in Bayesian statistics [7]. 

In this introductory section, we present the main results and give the outline 
of the paper, highlighting the three main contributions of this work. We only 
briefly define the concepts we need in this general introduction, and refer the reader 
to the following sections (in particular Section [J) for further precisions on the 
mathematical objects at hand. 

1.1. General setting for molecular dynamics with constraints. We con- 
sider mechanical systems with constraints. The configuration of a classical A'^-body 
system is denoted by {q^p) G M^^. The results of the paper can be generalized 
mutatis mutandis to periodic boundary conditions (g S T"^^, where T = R/Z de- 
notes the one dimensional torus) , or to systems with positions confined in a domain 
g S I? C M'^^. The mass matrix of the system is assumed to be a constant strictly 
positive symmetric matrix M. One could typically think of a diagonal matrix 
M = Diag(mild3, • • • ,mArId3) € R-^^^-^^. The interaction potential is a smooth 
function V : M.^^ R. The Hamiltonian of the system is assumed to be separable: 

H{q,p)^]^p^M-^p + V{q). 

In the present paper, the focus is on the canonical ensemble, which is the equilibrium 
probability distribution of microscopic states of a system at fixed temperature (fixed 
average energy). For systems without constraints, this ensemble is characterized 
by the probability distribution 

(1.1) p,{dqdp) = Z-^e~f^"'''^-PUqdp, Z^l e-^", 

where Z is the normalizing constanlQ ensuring that ^ is indeed a probability distri- 
bution, and /? = (kBT)^^ is proportional to the inverse temperature. One dynamics 
which admits the canonical measure (|1.1|) as an invariant measure is the Langevin 
dynamics (see for instance [351 Section 2.2.3] and references therein): 

( dqt= M-^ptdt, 

[dpt= -\/V{qt)dt - j{qt)M'^pt dt + a{qt) dWt, 



The potential V is assumed to be such that Z < oo. 
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where Wt is a standard SA^-dimensional Brownian motion, and 7(9), ^(g) are 3A^ x 
position dependent real matrices which are assumed to satisfy the fluctuation- 
dissipation identity 

(1-3) <r{q)<J^{q)^^l{q)- 

The Langevin dynamics can be seen as some modification of the Hamiltonian dy- 
namics with two added components: a damping term —j{qt)M~^pt dt and a random 
forcing term a{qt)dWt. The energy dissipation due to the damping is compen- 
sated by the random forcing in such a way that the temperature of the system 
is T = (fcs/?)"^ (with fee Boltzmann's constant). 

We will consider positions subject to a m-dimensional mechanical constraint 
denoted by 

e('z) = (ei(<z),.-.,ern(g))'^-^eR". 

As will become clear below, constrained systems appear in computational statistical 
physics in two kinds of contexts (see e.g. [35J Chapter 10], and [TH |3S] for appli- 
cations to the computation of free energy differences, and [31 for mathematical 
textbooks dealing with constrained Hamiltonian dynamics): 

(i) for free energy computations, where ^ is a given reaction coordinate parame- 
terizing a transition between " states" of interest ; 

(ii) when the system is subject to molecular constraints such as rigid covalent 
bonds, or rigid bond angles in molecular systems. 

In the sequel, ^ may be thought of at first reading as a reaction coordinate (case 
(i)). Section HTT] explains how to handle additional molecular constraints (case (ii)) 
within the same formalism. In any case, the position of the system is constrained 
onto the submanifold of co-dimension m: 

(1.4) I](z) = {q e I C(g) - z}, 

and the associated phase space is the cotangent bundle denoted by 

(1.5) T*S(z) = [{q,p) G R^^ I g e S(z), V^{qfM-^p = o}. 
For a given g G 5](z), the set of cotangent momenta is denoted by 

(1.6) r;i](z) = |p G R^^ Vi{qf M~^p = {)y 

The orthogonal projection on T*Ti{z) with respect to the scalar product induced 
by M"-"^ is denoted 

(1-7) PM{q) = Id - Ve(<z) G-^l{q)Vi{qfM-\ 

where Gm{<i) is the Gram matrix associated with the constraints 

(1.8) GM{q)^^aqfM-'^aq)- 

Throughout the paper, we assume that Gm is invertible everywhere on Y,{z) (for 
all z). It is easily checked that Pm satisfies the projector property PM{q)^ = Puiq), 
and the orthogonality property 

M-'PAiiq) = PM{qfM-\ 
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1.2. The constrained Langevin dynamics. For constrained systems, the asso- 
ciated canonical distribution is defined by 

(1.9) f^T'Ei.){dqdp) = Z-ie-^^(«'J')(7T.E(.)(rfgdp), 

where crx-'s{z)idq dp) is the phase space Liouville measure of T*T,{z), and Z^^ the 
normalizing constant (z refers to the position constraint, and to the velocity or 
momentum constraint, see (|2.15p below). See Section [2?3l for precise definitions. 

A dynamics admitting the constrained canonical measure (|1.9|) as an invari- 
ant equilibrium measure is the following Langevin process ( "CL" stands for "con- 
strained Langevin"): For a given initial condition (907^0) G T*T,{z), 



(CL) 



dqt ^ M ^pt dt, 

dpt = -VV{qt) dt - -l{qt)M-^pt dt + cj{qt) dWt + V^iqt) dXt, 



where the K^-valued adaptecfl process i !-> At is the Lagrange multiplier asso- 
ciated with the (vectorial) constraint (Cg), and 'y{q),o-{q) are again assumed to 
satisfy (fO)) . Note that {qt,Pt) G T*I](z) for aU t > 0. Then, averages of an 
observable A : M with respect to the distribution (|1.9|) can be obtained 

as longtime averages along any trajectory of the dynamics (CL) (when Pm'jP^j is 
symmetric positive on S(z)): 

(1.10) lim ^[ A{qt,pt)dt= [ Ad^T'Y.(z) a.s. 

This is made precise in Section 121 Several recent studies {e.g. [231 [551 HSl H]) 
have analyzed dynamics similar to (CL) and some appropriate discretization of the 
process in order to approximate the left-hand side of (jl.lOp . 

The first contribution of our work is to propose a simple discretization of the 
dynamics (CL) and to highlight its remarkable properties. The numerical scheme 
is based on a splitting strategy between the Hamiltonian and the thermostat part, 
see Equations p.l6p - p.l7|) - p.l8p below (in the spirit of the scheme proposed in [J] 
in the unconstrained case). The Hamiltonian part is discretizcd using a Verlet 
scheme with position and momentum constraints (the so-called RATTLE scheme, 
see )• We show that this discretization enjoys the following properties: (i) for 
some choice of the parameters, an Euler discretization of the overdamped Langevin 
dynamics (also called Brownian dynamics) with a projection step associated with 
the constraints is obtained (see Equation (|3.23p and Proposition 13. 6[) ; (ii) it can 
be completed by a Metropolis-Hastings correction to obtain a Generalized Hybrid 
Monte Carlo (GHMC) method sampling exactly {i.e. without any bias due to 
time-discretization) the constrained canonical distribution ()1.9|) (see Algorithm 13.51 
below) . The so-obtained numerical scheme is close to the ones proposed in [5^1 [Ml 
[23] . See also [16l[36] for historic references on Hybrid Monte Carlo methods, and [28] 
for GHMC. One output of this part is thus a new Metropolization procedure for 
overdamped Langevin dynamics to sample, without bias, measures with support a 
submanifold. 



'i.e. a random variable depending only on the past values of the Browrnian motion. 



LANGEVIN DYNAMICS WITH CONSTRAINTS 



5 



1.3. Free energy computations. The free energy F : M™ — > M associated with 
the reaction coordinates ^ : M.^^ — > M'" is defined as — /3^^ times the log-density 
of the marginal probability distribution of the reaction coordinates ^ under the 
canonical distribution (|l.ip . Explicitly, it is defined through the following relation: 
for any test function : R™ — > M, 

(1.11) / 4>{z)e-P^^^Uz= f mq))Kdqdp). 

In words, e~^^^^^ dz is the image of the measure fi by ^, and F can be seen as an 
"effective potential energy" associated to ^. 

Computing the free energy profile z i-^ F{z) (up to an additive constant inde- 
pendent of z), or free energy differences between two states F{z2) — F{zi) is a way 
to compare the relative probabilities of different "states" parameterized by ^. This 
is a very important calculation for practical applications, see [SI [35]. A state should 
be understood here as the collection of all possible microscopic configurations {q,p), 
distributed according to the canonical measure (jl.ip . and satisfying the macroscopic 
constraint ^(q) = z. Since we only focus on computing free energy differences, F 
is defined up to an additive constant (independent of z, denoted by C below, and 
whose value may vary from line to line) and can be rewritten as: 

Fiz)^-^ In / e-P^'^'^'P^ <5^(,)„, (dq) dp 

= -^ln/ e-^^(«)%,)_,(dq) + C, 

P JS(2) 

where (5^(g)_^ denotes the conditional measure on T,(z) verifying the following iden- 
tity of measures in M.^^ : dq = (5j(q)„r((i(7) dz (sec Section [231 for more precisions on 
this relation). 

However, when using constrained simulations in phase space, the momentum 
variable of the dynamical system is also constrained, and a modified free energy 
(called "rigid free energy" in the sequel, see Remark p.3p below for a justification 
of the term "rigid") is more naturally computed, see Section [3] The latter is defined 
as 

(1.13) ^^*i,(z) = -iln / e-^«(^^f)aT.s(.)(dgdp). 

The superscript M indicates that this free energy depends on the considered mass 
matrix, even though this is not clear at this stage (see (|4.3p below). The above two 
definitions of free energy are related through the identity: 

(1.14) F{z) ~ = -4 In / (det GM)-'/'d/^T.s(.) + C, 

P JT'Y.(z) 

where ^it*^(z) is the equilibrium distribution with constraints (|1.9p . The rela- 
tion (|1.14[) . already proposed in [TJ (see also [TU [TSl [331 [2S] for related formulas), 
is proved at the beginning of Section[3| For any value of the reaction coordinate, the 
difference F{z) — F^^^{z) can then be easily computed with any method sampling 
the probability distribution fJ.T*^{z), such as (CL). 

Several methods have been suggested in the literature to compute either F or 
F*^j from the La grange multipliers of a constrained process similar to (CL). We 
refer for instance to [H] (and references therein) for the Hamiltonian case, and 
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to [9] (and references therein) for the overdamped case. The second contribution 
of this paper is twofold: (i) we rigorously prove that the longtime average of the 
Lagrange multipliers in (CL) converges to the gradient of the rigid free energy (|1.13p 
(the so-called mean force); and (ii) we then show that the latter mean- force can be 
computed with second order accuracy {i.e. up to O(Ai^) error terms, where At is 
the time-step) using the Lagrange multipliers involved in the Hamiltonian part of 
the splitting scheme. 

More precisely, the first point (i) amounts to showing that 

(1.15) hm i rdAt = V.^^//d(z) a.s. 

As compared to [12], where a formal proof for the Hamiltonian case is proposed, we 
use an explicit calculation that does not require the use of the Lagrangian structure 
of the problem, or a change of coordinates. Once ^zF^gdi-^) is obtained, F^gd{z) 
can be computed (up to an additive constant) by integration. This procedure is 
known as thermodynamic integration. Note that using (jl.lSp and thermodynamic 
integration, together with ()1.14p . allows to obtain F{z) without computing second 
order derivatives of ^. This is a desirable property since computing such high 
derivatives may be cumbersome for some reaction coordinates used in practice. 
Straightforward computations of the mean force using analytical expressions (see 
for instance (|4.1ip - (|4.12p ) usually involve such high order derivatives. 

The second point (ii) is then based on a discretization of a variant of (|1.15|) . 
obtained by subtracting the martingale part of the Lagrange multipliers. This 
amounts to averaging the two Lagrange multipliers involved in the RATTLE part 
of the scheme, see (|4.18p ). 

We also discuss how these techniques can be generalized to compute the free 
energy for systems with molecular constraints, see Section [4. II 

1.4. Jarzynski- Crooks relations and nonequilibrium computations of the 
free energy. The last part of this article is devoted to nonequilibrium methods for 
free energy computations, based on a Hamiltonian or Langevin dynamics with con- 
straints subject to a predetermined time evolution. Such methods rely on a nonequi- 
librium fluctuation equality, the so-called Jarzynski-Crooks relation. See [29j for a 
pioneering work, as well as [TOl [H] for an extension. They are termed "nonequilib- 
rium" since the transition from one value of the reaction coordinate ^ to another 
one is imposed a priori, in a finite time T, and with a given smooth deterministic 
schedule t E [0, T] H' z{t) e M™. In particular, it may be arbitrarily fast. Therefore, 
even if the system starts at equilibrium, it does not remain at equilibrium. The out- 
of-equilibrium Langevin process we consider to this end is given by the following 
equations of motion ("SCL" stands for "switched constrained Langevin"): 

M-^Pt dt, 

-VV{qt) dt - -ip{qt)M-^pt dt + ap{qt) dWt -f V^{qt) dXu 

{c,{t)) 

where i At S R™ is an adapted process enforcing the constraints {Cq{t)) (the La- 
grange multipliers). Initial conditions are sampled from the phase-space canonical 
distribution defined by the constraints ^{q) = z{0) and v^{q,p) = \^£^{q)^ M^^p = 



{dqt = 
dpt^ 
ait) = 
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i(0) (see (|2.15p ). We restrict ourselves to projected fluctuation-dissipation matrices 
of the specific form 

(1-16) {ap,^p):^iPM<J,PMlPL): 

where j{q),(T{q) e ]^3Afx3Af satisfy the fluctuation-dissipation identity (|1.3|) . Note 
that 7p, ap also verify (jl.3p . Our analysis also applies to deterministic Hamiltonian 
dynamics upon choosing 7 = 0. The dynamics (SCL) is a natural extension of the 
constrained Langevin dynamics (CL). It is different from the dynamics proposed 
in |31| , which is a Langevin dynamics associated with a modified Hamiltonian with 
projected momenta, driven by a forcing term along which acts directly on the 
position variable. As explained below (see (|5.3p and the discussion following this 
equation), the specific choice (|1.16p (rather than considering unprojcctcd matri- 
ces ^{q),cr{q) S K3Afx3/Vj jg^ds to a simpler analysis and more natural numerical 
schemes, based again on a splitting procedure. 

As explained in Section 15.31 it is possible to define the work associated with 
the constraints exerted on the system between time and T as the displacement 
multiplied by the constraining force: 

(1-17) Wo,t ({'7s,pJo<s<t) ■■= I ^^(«) d^s- 

The third contribution of the present paper is twofold: (i) We derive a new general 
Crooks- Jarzynski relation (see Theorem 15.31 below) based on the nonequilibrium 
constrained dynamics (SCL) and the associated work defined in (|1.17p : and (ii) 
An original numerical scheme is proposed, which allows to compute free energy 
differences without time discretization error (see Theorem 15.51 below). More pre- 
cisely, concerning the first point, the main corollary is given by the following result. 
Consider the corrector 

(1.18) C{t,q) = ^ln(detGM(<z)) - ii(t)^G^/(<7)i(t), 

where ^ In det Gi\i (q) is the so-called Fixman term due to the geometry of the po- 
sition constraints (see (|1.14p and Remark and ^z{t)'^G~^}'{q)z{t) is the kinetic 
energy term due to the velocity of the switching. Then, the free energy profile can 
be computed through the following fluctuation identity (see (|5.23p ): 

(1.19) F(z(T)) - F(z(0)) = In ^pm^) 

where the expectation is with respect to canonical (equilibrium) initial conditions 
and for all realizations of the dynamics (SCL). 

The numerical scheme mentioned in the second point (ii) above is based on a 
modification of the splitting scheme used to discretize the constrained Langevin 
dynamics (CL). This modification allows to take into account the evolving con- 
straints. Using the symplecticity of the modified RATTLE scheme, we are able to 
prove a discrete-in-time version of the Crooks relation, and of the associated Jarzyn- 
ski free energy estimator (jl.l9p . Moreover, for some choice of the parameters, the 
latter scheme yields a Jarzynski-Crooks relation for an Euler discretization of the 
overdamped Langevin (Brownian) dynamics with a projection step associated with 
the evolving constraints, without time discretization error. This can be seen as an 
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extension of the scheme formerly proposed in |34] (see Equation (|5.54p and Propo- 
sition [52]) . We also check the consistency of the various free energy estimators we 
introduce. 

1.5. Organization of the paper. We start with an introduction to the mathe- 
matical concepts required for mechanically constrained systems in Section [5] Scc- 
tionOis devoted to the properties and the discretization of mechanically constrained 
Langcvin processes defined by (CL) , and the problem of sampling the canonical dis- 
tribution (|1.9p . Thermodynamic integration with constrained Langevin processes 
is presented in Section |4l Section [5] discusses noncquilibrium constrained Langevin 
processes (SCL) and the associated Jarzynski-Crooks fluctuation identity ()1.19p . 
Finally, some technical lemmas are gathered in Section [HI 



2. Preliminaries 

After making precise our notation for matrices and matrix valued functions in 
Section [2. 11 we introduce some additional concepts required to describe constrained 
systems in Section 12. 2[ and define the phase space measures with constraints in 
Section [231 



V^(g)= V6(9),...,V6n(9) e 



2.1. Notation. Throughout the paper, the following notation is used: 

• Vectors and vector fields are by convention of column type. When vectors 
are written as a line, they should be understood as the corresponding col- 
umn version. For instance, {q,p) € should be understood as {q^ ,p'^)'^ , 
where q,p E K."^^ are both column vectors. 

• Gradients in M.^^'^ (or E^^) of m-dimensional vector fields are by convention 
3A^ X m-matrices, for instance: 

^3Nxm 

where V^i(g) G M'^^ is a column vector for any i = 1, . . . , to. Gradients in 
the space of constraints parameters z G or C G IR^™ are denoted with 
the associated subscripts, namely and V^. 

• Second order derivatives in of TO-dimensional vector fields are charac- 
terized through the Hessian bilinear form: 

(2.1) Hess,(^)(z;i,«2) = : 

where vi,V2 S K'^^ are test vectors. 

• The canonical symplectic matrix is denoted by: 



(2.2) J:= 



IdsAT \ ^ ^6Nx6N 

-Id-m 



For any smooth test functions ipi : M*^^ M"i and ip2 : M"" , the 

Poisson bracket is the ni x n2 matrix 

(2.3) {^i,(/P2} = (V</Pi)^JV^2eK"^^"^ 

• For two matrices A,B e R"^", A : B = Tr(A^B). 
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2.2. Constraints. Contrarily to what is often done in the literature, we avoid 
global changes of variables and the use of generalized coordinates. We observe 
that global changes of variables are not required for the proofs of the theoretical 
results we present, and they are definitely to be avoided in practical numerical 
computations whenever possible. 

Two useful concepts to study constrained Hamiltonian systems (in particular, to 
use the co-area formula in phase space, as well as the Poisson bracket formulation 
of the Liouville equation) are the effective velocity and the effective momentum 

associated with the constrained degrees of freedom ^: 

(2.4) v^iq,p)^vaqfA'r'peW\ 

and 

(2.5) p^{q,p) = G-,l{q)v^{q,p) = G^/ (<7)Ve(g)^A/- V e M". 

The expression of the effective velocity is obtained by deriving the constraint ^ 
along an unconstrained trajectory of the Hamiltonian dynamics 

,t = -v.(,-,,, 

since ^^^^f^ = v^{qt,pt)- The term Gjj{q) in the expression (|2.5p of the effective 
momentum may be interpreted as the effective mass of ^. This can be motivated by 
a decomposition of the kinetic energy of the system into tangential and orthogonal 
parts, using the projector ()1.7|) for a given position q G M.^^ : 

EMp) = \p^M-^p 

= \ P^Pm {qfM- 1 Pm {q)p +\y (Id - Pm {q)fM- ' (id - Pm {q))p. 
The orthogonal part can be rewritten, for any {q,p) G R^^, as: 

EL{<1,p) ■■= ip^(ld-PA/(g))V-i(ld-PM(<z))p 

= ^v^{q,pfGAi{Q)vdl^P) = ^Pdq:PfGM{q)pdl^P)- 

The last equations allow to consider G^/ as some effective mass. 

The constraints on a mechanical system can also be reformulated in the more 
general form 

(2.6) E{q,p) = C e M^", 

where either (i) the effective momentum is constrained, in which case 5 = (C,pj) 
and C = iz,pz); or (ii) the effective velocity is constrained, in which case S = (^, v^) 
and C ~ iz,Vz)- The phase space associated with such constraints is denoted by 

(2.7) I]E(C) = {(<Z,p)eM6~ I S(q,p)=c}- 

A position q € 5](z) being given, the affine space of constrained momenta verify- 
ing (|2.6I) is then denoted by 

(2.8) ^v,(gM^z) = {p e I v^iq,p) = V,} 
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in the effective velocity case, and by I]p^(g..) (pz) in the effective momentum case. 
This notation is very important for nonequihbrium methods where the constraints 
evolve in time according to a predefined schedule, see Scction[5] Note that the phase 
space of mechanical constraints, defined by (|1.5p . is simply T*T.(z) — S^^^,^ (z, 0) = 
Sc,P5(z,0). 

We can now define the skew-symmetric Gram tensor of dimension 2m x 2m 
associated with the constraints: 

(2.9) T{q,p) = {E,E}{q,p) = VS^(g,p) JWE{q,p) e m2™x2™^ 

The Gram matrix F associated with the generalized constraints (j2.6|) can be explic- 
itly computed by block. Indeed, for 5 = {£,,p^)^ , 



(2.10) 



Id 
-Id VpJJVpi 



Therefore, dct(r) = 1 in this case. In the case ^ = (^, v^)'^ , the Gram matrix reads 
(2.11) r = 



Gm 
-Gm VvJ JVv^ 

and det(r) = det(GM)^- Note that in both cases dct(r) > 0. The constrained 
symplectic (skew-symmetric) matrix is now defined by 

(2.12) ,h{q,p) = ,/- JVS(g,p)r-i(g,p)VS^(g,p) J, 

and the Poisson bracket associated with generalized constraints (|2.6[) by: 

(2.13) {v'i,V'2}h = V(^f JhV(^2. 

This Poisson bracket is often called the Dirac bracket in the literature (in reference 
to the seminal work of Dirac (TSl 157]). 

It is easily checked that {•, •}- verifies the characteristic properties of Poisson 
brackets, namely the skew-symmetry. Jacobi's identity, and Leibniz' rule. There- 
fore, the flow associated with the evolution equation 

(2^14) |(«)^J.Vfl(,„p,). 

defines a symplectic map. Recall that (see [22l Section VII. 1.2]) a map : Sh(C) ~^ 
Sh(C) is symplectic if for any {q^p) G Ss(C) and u,v Cz T^^ p)I]s(C), 

u^\7 (j){q, p)'^ J\7 (j){q, p)v — u^Jv. 

A consequence of the symplectic structure is the divergence formula (|2.25|) relating 
the phase space measure (T^^i^Q{dq dp) on Sh(C) (defined below), and the Poisson 
bracket (j2.13p . The reader is referred to Chapter 8 in [3], and Section VII. 1 in [35] 
for more material on constrained systems. 

It will be shown in Proposition 13.11 below that the Poisson system (|2.14p is 
equivalent to (CL) when (7,17) = (0,0). 



2.3. Phase space measures. 
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2.3.1. Definitions. The phase space measure (also termed Liouville measure) on 
the phase space T*'S{z) (or more generahy on I]h(C)) of constrained mechanical 
systems is denoted by or* 2(2) (or more generally a-^^(^Q). The latter is induced by 
the symplectic, or skew-symmetric 2-form on MP^ defined by the canonical skew- 
symmetric matrix J in R^^. More precisely, it can be defined through the volume 
form \dctQ{u{q,p))\^^^ , where 

Ga.b{u) = {uaY Jub, a,h = 1,. . . ,&N - 2m, 

and (ui(g,p), . . . , M6jv-2m(9jP)) is a basis of tangential vectors of the submani- 
fold r*S(z) (or Eh(C)) at a given point {q,p). 

Surface measures induced by scalar products associated with general symmetric 
definite positive matrices will also be of interest. We denote by {dq) the surface 
measure on S(z) induced by the scalar product (g, q) m = q^Mq on M.^^'^ , and, for a 
given q S S(z), by cr^^ ^ ){dp) and ^ ^(•^ ){dp) the surface measures on the 
affine spaces Sp^(q .^(pz) and .)(u2) respectively, induced by the scalar product 
(p,p)jv/-i = p^M~^p on M.^^ . For more precise definitions of these measures, we 
refer to [35l Sections 3.2.1 and 3.3.2] and the references therein. 

It is now possible to define a generalization of the canonical distribution (|1.9|) as 
follows: 



(2.15) 



Q-0H{q.,p) 

^Az,v,){dqdp) := — (TSj ^(z,v^){dqdp), 



Az,v,) 



e daY,^^^^{z,v,)- 



T 



The distribution p.l5p is associated with the generalized constraints S = (^, 
and is used in Scction[5]for noncquilibrium methods. Note that ^y^^ ^ q) = fJ-T*j:{z) 
defined in ([Ol) . 

2.3.2. Co-area decompositions. The co-area formula (see [21 120]) relates the phase 
space or surface measures, and the conditional measures. Conditional measures 
are defined in M.^^ by the following conditioning formula: for any test function cj) : 



(2.16) / (j}{q,p)dqdp= I / <P{q,p) 5E.{q.p)~c{dqdp) dC,. 

In the same way in R'^^, conditional measures are defined, for any test function 
^R, by 



(2.17) / mdq^ / / mh(<l)-z^dq)dz. 

A more concise notation for the above equalities is dqdp = 5'^(q.p)-Q{dq dp) dC, and 
dq = 6^(q)_z{dq) dz. 

Proposition 2.1 (Co-area). Let S(z) he the submanifold (|1.4p defined by the con- 
straints ^(g) = z, and assume that Gm defined in (jl.Sp is non- degenerate in a 
neighborhood ofYi{z). Then, in the sense of measures on R^^; 

(2.18) h{<i)-z{dq) = (detM)"^/' |detGM(g)r'^V^^,)(dg). 
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Let be the phase space defined by generalized constraints p.6p . Assume that T 

defined in p.9p is non- degenerate in a neighborhood o/I]h(C). Then, in the sense 
of measures on MP^ : 

(2.19) Ss{q,p)-.cid'ldp) ^ \dctr{q,p)\ a^-^Q{dqdp). 

We refer for example to Chapter 3 in |35j for an elementary proof. An equivalent 
of p.l8|) - p.l7|) for momenta reads, for constrained effective momenta: 

(2.20) dp = <5p,(,,rt_pjdp) dp, = dct(A/)i/2 |det GMiq)\'^^ ^€,Imp^)^'^P^ '^^^ 
and for constrained effective velocities: 

dp = 5^^(^q.,p)-vAdp) dv;, = det(M)^/^ |det 6*^/(9)1"^^^ cr^^^'_^ ^(„^)((ip) dv,. 

Using the co-area formulas (|2.18p - (|2.19p . and the expressions of symplectic Gram 
matrices (|2.10p - (|2.1ip . we obtain the following expressions of the phase space mea- 
sures: 

(i) The phase space measure on S{,pj (^jPz) can be identified with the conditional 
measure defined in ([2.16^ : 

(2-21) (Ts;.p^(2,p,)(rf9rfp) = %(g)_^,p5(q,p)_p,)(dgdp), 

while the phase space measure on 'E,^_-u^(z,Vz) is related to the corresponding 
conditional measure as 

(2.22) crSf,^^{z,v,}{dqdp) = det(G'M) '^(5(q)-^,t,5(q,p)-„,) (dg rfp)- 

(ii) The phase space measures are given by the product of surface measures: 
(2-23) as,.,^iz,p.){dqdp) = <;',,(p,)(rfp)4{,)(dq), 

and 

(2-24) ^s,,, (d^rfp) = .,(„^)(dp)4{,)(dg). 

Equations (|2.23p - p.24p are a consequence of the fact that 

^(^(9)-z,P5(g,p)-p,)(c^'7rfp) = Spi{q,p)-pAdp)Si{q)-z{dq) 

(and a similar relation for v^). 

2.3.3. Divergence formulas. We end this section with an important formula, which 
is used to show the invariance of the canonical measure in the proof of Proposi- 
tion O 

Proposition 2.2 (Divergence theorem in phase space). Consider the Poisson 
bracket {•, defined by p.l3p . and an open neighborhood O of Se(C) C 
where T is invertible. Then for any smooth test functions ipi,(p2 ■ — > M with 
compact support in O, 



(2.25) 



I "^2}^ rfcrs-(^) = 0. 

Sb(C) 
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The divergence formula ()2.25p can be proved using Darboux's theorem and in- 
ternal coordinates, or directly using the co-area formula (see Section 3.3 in |35|). 

We will also need the classical divergence formula on affinc spaces (see for in- 
stance Section 3.3 in [35]): for a fixed q G R'^^, for any compactly supported smooth 
vector field 0(g,_p) € R^^, 



(2.26) 



/ divp(PM(g)0(g,p))<" ,(.,)(dp) = 0. 

3. Constrained Langevin processes and sampling 



We first give some properties of the constrained Langevin equation (CL) in Sec- 
tion [2111 then propose some numerical schemes to discretize it in Section [321 ^nd 
finally consider the overdamped limit in Section [ 



3.1. Properties of the dynamics. We consider the dynamics (CL): 

dqt = Ar^ptdt, 

dpt - -VV^(gt) dt - jiqt)M-^pt dt + a{qt) dWt + V^iqt) dXt, 

By differentiating with respect to time the constraint ^{qt) = z, the Lagrange 
multipliers can be computed explicitly (see for instance Section 3.3 in [35]): 

dXt = -Gjliqt)[ncss,^{0{M-'pt,M-'pt) dt 

+ V^iqtfM-' ( - VV{qt) dt - -f{qt)M-'pt dt + a(qt) dWt) 

(3.1) = f^lM.Pt) dt + G-^}{qt)Va<ltfM-^ {jiqt)M-'pt dt - aiqt)dWt) , 
where the constraining force f^^^ G M™ is defined as: 

(3.2) C(9,p) = G-,^iq)vaqfAr'yv{q) - G-,liq)llcss,iOiM-'p,M-'p). 

Thus, using the fact that PM{q)'^ M^^p — M^^p when p € T*ll{z), the dynam- 
ics (CL) can be recast in a more explicit form as 

' dqt = M^^pt dt, 



(3.3) 



dpt = -Vy(gt) dt + Ve(gt)/4'd(9t,Pt) dt - 7p('7t)M"Vt dt 
+ ap{qt)dWt, 



where wc introduced the notation (crp,7p) := {Pm cr, Pm J Pj'f)- The constraint 
therefore has two effects: (i) the matrices 7, a in the dissipation and fluctuation 
terms are replaced by their projected counterparts "fp,ap, and (ii) an orthogonal 
constraining force V^/^^^^j is introduced. 

The generator of this stochastic Langevin dynamics is the operator £■£ which 
appears in the Kolmogorov evolution equation: for {qt,pt) satisfying (CL) or (|3.3p . 
and for any smooth test function ip, 

jE{ipiquPt))^^CEiip){qt,Pt))- 

The expression of £s can be obtained using Ito calculus, as made precise in the 
following proposition. 
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Proposition 3.1. Consider either the effective momentum ()2.5p or the effective 
velocity p.4|) . denoted with the general constraints S = {S,,v^) or "E. = {S,,p^) 
(see (j2.6|) ). The solution of the constrained dynamics (CL) (or equivalently p.3p 
with an initial condition {qotPo) € '^siz^O)) belongs to Eh(z,0) = T*E(z), and the 
generator of this Markov process reads ( whatever the value of z) 

(3.4) CE = {-,H}^ + Ci'^\ 
where the fluctuation-dissipation part is 

with (crp,7p) defined in (|1.16p . Using the fluctuation- dissipation relation (|1.3p . the 
generator can be rewritten more compactly as 

(3.5) £1^" = i e^^divp (c"^^ 7P Vp 

Proof. Wc perform the computation in two steps: (i) We compute tlie generator of 
the Hamiltonian part of tlie constrained Langevin dynamics, which is (CL) in the 
case {a, 7) = (0, 0); (ii) we compute the generator of the "thermostat" part of (CL), 
which is an Ornstein-Uhlenbeck process on momentum variable (corresponding to 
the second equation in (CL) with V = 0). 

Let us first consider (i), with S = {£,,v^) (the case S = {£,,p^) being similar). 
Note that 

(3.6) {E,H} {q,p) = (Hcss,(0(M-ip,M-^)'^Ve((7)^M-iVl/((7) 
where the Hessian operator Hess is defined in p.ip . Now, (|2.1ip implies that 



(3.7) r^^ = 



G-J 



Besides. v^{qt,pt) ~ along a trajectory, since {qt,Pt) € r*S(z). Therefore, 

(3.8) y{q,p) e T*E(z), {E,H} {q,p) = (^C('?'J')' 

where the notation f^^^ is introduced in (|3.2p . Consider a test function cp 
K, and remark that 

(3.9) {^,S}Q =-a^Ve^Vp^, 

so that, for any a <£ R™, 

W,E}T~'{E,H} = -(C)^VC^Vp^. 

Finally, for ah {q,p) S T*Y.{z), 

W,H}^ {q,p) ^ -S/ViqfS/Ml^P) + /4d(9,P)'^VC(g)^Vp^(q,p) 

+ p'^M-'Wg^{q,p). 

The operator (|3.10p is the generator of the Hamiltonian part in (|3.3p . 

We turn to (ii). The diffusive part arises from the fluctuation term ap{qt) dWt 
and its expression 



idiVp^PMCT'T^^^MVp • ) 
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is obtained directly from the standard Ito calculus. Similarly, the dissipation oper- 
ator is 

The addition of these two contributions gives the expression of O^^™. □ 

With the expression (|3.4p of the generator at hand, it is easily checked that the 
process (CL) satisfies the following equilibrium properties: 



Proposition 3.2. When the fluctuation-dissipation relation (|1.3|) holds, the con- 
strained Langevin dynamics (CL) on T*I](z) admits the Boltzmann- Gibbs distribu- 
tion (jl.9[) as a stationary measure, and is reversible up to momentum reversal with 
respect to (|1.9p ; // Law((7o,Po) = /^t*s(z)j then, for any T > 0, 

Law{qt,pt;0 <t<T)= Law((jT-t, ~Pt-uO <t<T). 

Moreover, if Pm{<1)iPm{<1)^ is everywhere strictly positive in the sense of sym- 
metric matrices on T*T,{z), then the process (CL) is ergodic: for any smooth test 
function tp. 



^lim 4 / f{qt,Pt)dt^ [ (pdfi' 



i-T 

'■fT*s(z) a.s. 



Proof. The stationarity and reversibility properties follow from the following de- 
tailed balance condition up to momentum reversal (see for instance Section 2.2 
in [5S]): for any test functions ipi, ip2, 



e 



(3.11) / m CB.{ip2) dflT'^iz) = {V2 S) Cb.[<PiO S)d^J.T*Y.(z), 

JT*T.(z) JT'1Z{z) 

where S : {q,p) {q,—p) is the momentum flip. In view of the expression p.4p 
of the generator, proving (j3.1ip amounts to proving this property for the operators 

For the Hamiltoniaii part {.,H}~, the expression (j3.10p yields 

o S, H}^ iq,p) = - H}^ [q, -p) = - H}^ {S{q,p)), 

which states the time symmetry under momentum reversal of the Hamiltonian part 
of the equations of motion (CL). On the other hand, 

1 

so that 

e-P" (^2 o S) o S, H}^ = - {c-P" if 2 Wi,H}^) ° S 

and the divergence formula (|2.25p yields the balance condition p. lip for the Hamil- 
tonian part, in view of the invariance of the distribution (Jt*'s:(z) under the momen- 
tum flip S. 

For the thermostat part, it is easily checked, that 

c'i'^iipo s) = ci'"\ip)o s 
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for any smooth test function tp, so that the detailed balance condition up to mo- 
mentum reversal p. lip follows from the following more general detailed balance 
condition, in the case ~Q (msj „j(z.-uj) being defined in (|2.15p ): 

(3.12) / ^1 £i^-(^2)dMs, / v'2 4'"^(v'i)rfMs, 

It is interesting to prove p.l2p for a general Wz G M since it will be used in the 
proof of Theorem 15.31 below. Consider the divergence formula (|2.26p in the afhne 
space for the variable p (the position q being fixed), with 

</. = 7n'}Vp((^2)e-''^(^i. 

After integration in g, using the formula p.Sp for O^^™ and (|2.24p . an expression 
symmetric in (1^91,(^52) is obtained: 

/ '/'I ^i"°(</'2)rf/ii;; „ fz,«,) = - / Vl(piPAaPZVpip2d^i^^ (^;,^^^), 

hence the detailed balance condition p.l2p . 

Ergodicity is a consequence of the hypo-ellipticity of the operator Ce on T*Y.(z) 
(Hormander's criterion is satisfied, see [27]), which is itself a consequence of the fact 
that Pm (9)7^^/(9)"^ is strictly positive on each T*S(z). The proof can be carried 
out using local coordinates and the results from [SD]. □ 

Remark 3.3 (Infinite stiffness limit) . We have considered in this section the Langevin 
dynamics (|3.3p with constraints rigidly imposed by a projection onto the submani- 
fold T*S(z). This dynamics samples the canonical distribution (|1.9p /iT*s(z) {dq dp) = 
Z~Q e~l^^^i'P) aT'T,(z) {dq dp), with constraints on both positions and momenta. The 
marginal on positions of this distribution is, in view of p.23p . proportional to 

e-^^^'^^a^^ydq). 

This is what we call in the following rigidly imposed constraints. The canonical 
distribution (|1.9p with rigid constraints is naturally associated to the rigid free 
energy p.l3p (and this is what justifies the qualification "rigid"), since, we recall, 

F,'^^{z) = -^In f c-^Hi^^P^ {dq dp) . 

Another way to impose some constraints on a system is to add a penalization 
term. In our context, this could be done by changing the potential energy V to 

VM = V{q) + -\i{q)-z\\ 

£ 

It is easy to check that, in the limit e — > (infinite stiffness limit), the canonical 
measure associated to this potential is the canonical distribution (|l.ip with positions 
conditioned by ^(g) = z. This distribution is proportional to c~I^^'^i^p) 6^(^g^_z{dq) dp 
and its marginal on positions is proportional to 

e^^^(')55(,)_,(dg). 

This is what we call in the following softly imposed constraints. The canonical 
distribution with soft constraints is naturally associated with the standard free 
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energy, since, we recall. 



F{z) = - ^ In / e-^^(«'P) <5^(g)-.(d<z) dp. 

P JS(z)xR3" 



Note that, in view of p.lSp . the marginal on positions for softly imposed con- 
straints can be written in terms of rigidly imposed constraints through a modifica- 
tion of the potential: 

e-Z^^t^'^J^t^j.^Cdg) = e-^(^+^-)(^)a^{,)(d<z) 

where 

(3.13) yfi,(g) = ^ln(detGM(<z)), 

is sometimes called the Fixman corrector (see [H]). Thus, if {qt,Pt) satisfies p.3p 
with the modified potential V + Vflx then qt samples (in the longtime limit) the 
probability measure proportional to e~^^^''^(5^(g)_2(d(7), and we thus refer to this 
dynamics as the softly constrained Langcvin dynamics. 

These concepts will be used in Section HTT] to describe the computation of free 
energy difFerenccs for systems with molecular constraints. 

Finally, let us mention that the infinite stiffness limit e — > of the Langcvin 
dynamics (|1.2p with the potential 14 is not (except for very specific forms of con- 
straints) the softly constrained Langcvin dynamics, as one would expect. We refer 
to [32] for example, where it is shown that adiabatic effective potentials (derived 
from the conservation of the ratio of energy over frequency of fast modes) are re- 
quired to describe the limiting dynamics. However, a formal argument based on 
"over-damping" the fast modes indeed leads to the softly constrained Langevin dy- 
namics. □ 

3.2. Numerical implementation. We consider in this section a numerical scheme 
based on a splitting of the Langevin dynamics (CL) into a Hamiltonian part (Sec- 
tion 13.2. 1|) and a fiuctuation-dissipation part acting only on the momentum (Sec- 
tion [3221) • Such a splitting is standard for unconstrained systems, but other split- 
ting strategies for the Langevin equation can be considered as well (see [38 l [39]). 

For simplicity, we restrict ourselves to constant matrices 7 and a. Generalizations 
to position dependent matrices arc straightforward. 

The Hamiltonian part of the Langevin dynamics (CL) (namely (CL) with (cr, 7) = 
(0, 0)) is discretized using a velocity- Vcrlct scheme with constraints, which yields p.l7[) 
below. The fiuctuation-dissipation part on momentum variable in (CL) is the fol- 
lowing Ornstein-Uhlenbeck process (for a fixed given q € S(z)): 

(dpt = ~-iM-^pt dt + adWt+ V^(<7) dApu, 
\V^[q)M-^Pt = 0, (Cp) 

which can be rewritten as (see (|3.3I) ) 

dpt = --tp{q)hr^pt dt + ap{q) dWt- 

This equation can be explicitly integrated on [0, t] to obtain: 

(3.15) =e-*^^(«)^^ Vo+ / c~^'~'^^^^'^^^^'' ap{q)dW,. 



(3.14) 
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However, the matrix exponential e"*'''^^''-'^^ may be difficult to compute in prac- 
tice (except for certain choices of 7 and M , see the discussion at the end of Sec- 
tion 13.2. 2p . Instead of performing an exact integration, (|3.14p can be discretizcd 
using a midpoint Eulcr scheme, which yields p.l6p and (|3.18p below. 

The numerical scheme we investigate, termed midpoint Eulcr- Vcrlet-midpoint 
Eulcr splitting, is therefore the following: 



(3.16) 



At 



Ve(g")^A/"^p"+^/^ = 0, 



(3.17) 



At 



n+1/2 ^ pn+l/4 \/V{q") + Ve(g") A"+l/^ 



qn+1 =(^« + AO./-ip"+^/^ 
C(g«+i) = z, 

pn+3/4 ^ pn+l/2 _ ^ V^(g"+1) A"+3/4^ 

Ve((7"+i)^M-ip"+3/4 = 0, 



(3.18) 



+ Ve(<?"+^)A"+i, 

y^(qn+l)T^j-lpn+l ^ 



— (T y 
2 



-1/2 
(Cp) 



where (^")n>o and (^"^^^^)n>o are sequences of independently and identically dis- 
tributed (i.i.d.) Gaussian random variables of mean and covariancc matrix IdsTv- 
Note that when 7 = and a ~ 0, the scheme (|3.16p - p.l7l) - p.l8p becomes 
deterministic, and reduces to p.l7p . which is a scheme for the deterministic Hamil- 
tonian equations of motion with position constraints ^(g) = z. The latter scheme 
is referred to as the "Hamiltonian scheme p.l7p " below. 



3.2.1. Comments on the Hamiltonian scheme (j3.17p . The Hamiltonian part p.l7p 
of the scheme, often called 'RATTLE' in the literature, is an explicit integrator, 
and is a modification of the classical 'SHAKE' algorithm (see Chapter VII. 1 in 
[22], or Chapter 7 in [32] for more precisions and historical references). In (|3.17p . 
)n+i/2 g j^m g^j^g Lagrange multipliers associated with the position constraints 
(Cq), and X^+^^^ g M™ arc the Lagrange multipliers associated with the velocity 
constraints (Cp). The nonlinear constraints {Cq) are typically enforced using New- 
ton's algorithm. In p.l7p . the (linear) momentum projection (Cp) is always well 
defined since we assumed that the Gram matrix GmIq) is invertible. On the other 
hand, the nonlinear projection used to enforce the position constraints ^(9"^^) = z 
is in general well defined only on a subset of phase space. 

Definition 3.4 (Domain DAt)- The domain D^t C T*S(z) is defined as the 
set of configurations {q"' ,p'^~^^^^) S r*S(z) such that there is a unique solution 
(g"+i,p"+3/4) verifying (|?Tf| . 
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Solving the position constraints (Cq) consists in projecting onto T,{z) a point in 
a A^-neighborhood of g". Thus, by the imphcit function theorem, the domain Dai 
verifies: 



lim Da* 

At->0 



It may happen that there is no solution if the time-step is too large, and, even 
for small time-steps, that several projections exist, see for instance Example 2 in 
Chapter 7 of [22] ■ In practice, Dai can be chosen to be the set of such 
that the Newton algorithm enforcing the constraints (Cq) has converged within a 
given precision threshold and a limited number of iterations. 

As for the Verlet scheme in the unconstrained case, the associated numerical flow 
shares two important qualitative properties with the exact flow: It is time reversible 
and symplectic (see |33|). This implies quasi-conservation of energy, in the sense 
that energy is conserved within a given precision threshold over exponentially long 
times, see [22ll32]. 



3.2.2. Comments on the fluctuation- dissipation part (|3.16|) and (|3.18p . The new 



momentum p 



,"+1/4 



€ T*S(z) in (or in (jXTg)) ) may be obtained by 



first integrating the unconstrained dynamics with a midpoint scheme, and then 
computing the Lagrange multiplier A"+^/* (or A"+^) by solving the following linear 
system implied by the constraints (Cp): 



Ve(g")^ Id 



At 



A sufficient criteria for stability is 




At 



V^(g")A"+i/* 



= 0. 



-7 < M. 



Besides, it can be checked (see Sections 2.3.2 and 3.3.5 in [33) that the Markov 
chain induced by the fluctuation-dissipation part of the scheme p.l6p (or p.lSp ) 
verifies a detailed balance equation (both in the plain sense and up to momentum 
reversal) with respect to the stationary measure K^,j^f^^^{dp). The latter is defined 
as the kinetic probability distribution 

(ZN-m)/2 



(3.19) 



(dp) 



27r/ 



exp 



-13 



p^M^^p 



<^T'i:(z)^dp), 



and is the marginal in the p-variable of the canonical distribution fJ'T-'j:(z){d.q dp) 
conditioned by a given q G ^{z). Moreover, if jp := Pm J Pm is strictly positive 
in the sense of symmetric linear transformations of T*T,{z), then the Markov chain 
on momentum variable induced by p.l6p (or p.lSp ') alone is ergodic with respect 
to K^^^^f^^^idp). 



Finally, an important simplification occurs in the integration of p.l4p in the 
special case when 7 and M are equal up to a multiplicative constant (so that 
7M~^ is proportional to identity). Indeed in this case the equality (7p(g)A/^^) = 
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Pa/ (9) (7-^^ holds for any n > 0, and p.lSp simplifies to 

(3.20) pt = PM{q) [e-'-<'''"p, + j\-^'-'^''^"'\dw)j . 

The numerical integration of (|3.14p can thus be carried out in two steps: (i) 
exactly integrating p.l4p without constraint, and then (ii) projecting the result 
onto T*Y,{z). 

3.2.3. Metropolis- Hastings correction. Usually, the invariant probability distribu- 
tion sampled by the solution of a numerical scheme is biased by the time discretiza- 
tion. Relying on (i) the time symmetry (up to momentum reversal) and (ii) the 
preservation of the phase space measure ut* e(z) {dq dp) by the solution of the RAT- 
TLE scheme (|3.17p . it is possible to eliminate the time discretization error in the 
splitting scheme p.l6p - p.l7p - p.l8|) by resorting to a Generalized Hybrid Monte 
Carlo algorithm. 

Algorithm 3.5 (GHMC with constraints). Consider an initial configuration {q^ ,p^) € 
T*'S{z), and a sequence {Q", G^~^^^^)n>o of independently and identically distributed 
standard Gaussian vectors. Iterate on n > 0: 

(1) Evolve the momentum according to the midpoint Euler scheme (|3.16p . and 
compute the energy E" = H{q''^ ,p^^^^/^) of the new configuration; 

(2) Integrate the Hamiltonian part according to the RATTLE scheme (j3.17p . 
denote (q«+i,p"+3/4) the resulting state, and set E'^+^ = H {q"+^ ,p'^+^ ' . 

(3) Accept the proposal ((7"+^,^"+'^/^) := (5"+-',p"+'^/^) with probability 

min(e-'3(^"+'-^"),l). 

Otherwise, reject and flip the momentum: (g""*'^,^"''''^/'*) = (g", — p"''""'^/^). 

(4) Evolve the momentum according to the midpoint Euler scheme (|3.18p . 

By construction, the GHMC algorithm with constraints leaves invariant the equi- 
librium distribution p,T*T,(z){dqdp) (see Section 3.3.5 in |35|). 

To understand the momentum reversal required upon rejection, it is useful to 
write more explicitly the Markov chain as the composition of a Metropolis-Hastings 
part, where the proposal is obtained by a RATTLE step followed by a momentum 
reversal (the latter operation is needed to ensure the symmetry of the proposition), 
which is then accepted or rejected; and another momentum reversal (which leaves 
invariant the targeted probability measure p-T-''S(z)idq dp)) . When the proposal is 
accepted, the two momentum reversals cancel out each other. On the other hand, 
when the proposal is rejected, momenta are actually reversed. See |351 Section 2.1.4] 
for more background on generalized Metropolis-Hastings algorithms. 

In the above, we implicitly assume that the RATTLE scheme p.l7|) is every- 
where well defined. In practice however, it is necessary to modify Algorithm 13.51 bv 
restricting the sampled configurations to I?At- This can be achieved by introducing 
additional tests in steps (1), (2) and (4), and rejecting the states that have gone 
outside the set D^t C T*Y,{z) where the position constraint (Cq) is well defined. 
By doing so, the global algorithm has an invariant equilibrium distribution given 
by fJ-T*'S{z){dq dp) conditioned on the set of states i^At- This invariant distribution 
can be written explicitly as follows: 

(3.21) y^e-^"^'''Ph^q^pj^o^^aT*nMd<ldp). 
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Alternatively, the rejection tests in steps (1), (2) and (4) of Algorithm 13.51 can 
be performed with a cut-off parameter Rai > on the momentum variable, cho- 
sen so that the position constraint (Cg) in (j3.17p is everywhere well defined when 
^p^M~^p < i?At- This can be achieved when there exists i?At > small enough 
so that i;(z) X {^p'^M^^p < RAt} C Dai C T*Y.{z). Since this is useful for later 
purposes (see the discussion at the end of Section [4.3p . we provide a rough estimate 
of RAt in terms of At, assuming for simplicity that T,{z) is compact. First, by the 
implicit function theorem, there exists a > such that, for all q G S(z) and Sq 
with norm \\5q\\ < a, there is a unique A G M'" satisfying 

^{q + Ar\dq + Vaq)X))^z. 

Therefore, there exists a > small enough such that, when |b"+i/^|| < a/At, 
the RATTLE scheme in p.l7|) is well defined, namely there exists a unique g""*"^ 
satisfying the constraint (C,). This shows that 

(3.22) RAt > AAt-"^ 

for some ^ > 0. 

The invariant probability distribution of the Markov chain generated by GHMC 
with the additional rejection steps ensuring ^p^M~^p < Rai-, is given by (|3.2ip . 
and actually reads 

^z,0,At ^ 

Its marginal distribution in the position variable is then exactly given by: 

j-e-^^^^^al,ydq). 

This is also the marginal distribution in the position variable of ^t-'T.(z)- Note 
however, that if RAt is too small, only small momenta will be sampled in step (1) 
of Algorithm l3.51 and the correlation time of the sampling will be large. In practice, 
the threshold RAt should be tuned in preliminary computations so that: (i) RAt 
is small enough so that the maximal number -/Vmax of iterations for the Newton 
algorithm used to enforce (C^) in (|3.17p is never reached; (ii) i?At is large enough 
so that the correlation time of the sampling is as small as possible. 

Let us end this section with a warning: It is now known that the correction of 
the bias in discretizations of the Langevin dynamics by a Metropolization of the 
scheme may reduce the efficiency of the sampling, see for instance [1]. 

3.3. Exact sampling on a submanifold with overdamped dynamics. Con- 
strained ovcrdamped Langevin processes (or Brownian dynamics) are solutions of 
the stochastic differential equation (see also [33 H]) 

(3 23) , ---VV{qt)dt+^dWt + ^aqt)d\t, 



where At is an adapted stochastic process. Equivalently, p.23p can be rewritten in 
the Stratonovitch form as 

dqt = -P{qt)VV{qt) dt + ^J^Piqt) o dWt, 
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where o denotes the Stratonovitch integration, and P is the projector defined 
by (fTT]) with the choice M = Id: 

(3.24) Piq) = Id - VC(g) (g)Ve(g)^, G(g) = Ve(g)^V^((7). 

It can be shown that (|3.23p verifies the detailed balance condition for (and is ergodic 
with respect to) the invariant distribution 

(3.25) Z7ie-''^(^)ai,%)(dg), 

which is the marginal in the (/-variable of the canonical distribution with con- 
straints ()1.9p for the choice M = Id. It is easy to generalize all our results to 
scalar products associated with a general symmetric definite positive matrix M 
upon considering (|3.32p . sec Remark |3 . 71 below. 

The constrained overdamped Langevin process p.23p may be obtained from a 
scaling limit of the constrained Langevin dynamics (CL) (in the limit when either 
the mass goes to zero, or the damping 7 goes to infinity), see Propositions 2.14 
and 2.15 in [55] . 

Likewise, at the discrete level, an Euler-Maruyama discretization of the over- 
damped process (|3.23[) can be obtained as a particular case of the numerical dis- 
cretization (|3.16p - (|3.17p - (|3.18p for the Langevin equation (CL), yielding a Markov 
chain (g")n>o on positions. The condition that the mass goes to is replaced by 
the condition that the mass is proportional to the time-step. This is the content of 
the following proposition. 

Proposition 3.6. Suppose that the following relation is satisfied: 

(3.26) ^7 = M = ^Id. 

With a slight abuse of notation, the mass matrix and the friction matrix are rewrit- 
ten as Mid and 7 Id with M, 7 e M. Then the splitting scheme (|XTB)) - (PT7|) - ((XT5)) 
yields the following Euler scheme for the overdamped Langevin constrained dynam- 
ics (p:23)) ; 




(3.27) J- - ^tvv{qn + J'^-fg- + ys,{q-)Kl\ 



where (^")„>o are independent and identically distributed centered and normalized 
Gaussian variables, and (A"^)„>i are the Lagrange multipliers associated with the 
constraints {^{q") = z)n>i. Moreover, the Lagrange multipliers in p.l7|) verify: 

(3.28) 2A"+^/2 ^ G^i(g")(ve(g")^ - q") + AiVe(g")^VV^(q") 



(3.29) = x:+' + J^G-\q")V^{q"fg- 



as well as 



(3.30) 2A"+3/4 = G-i((7"+i)(ve(g"+')^ (<z" - + A<Ve((?"+')^Vy((z"+i) 
where G is defined in (j3.24p . 
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Proof. Irrespective of p", the choice p.26p in the scherae p.l6p - p.l7p - p.l8|) leads 
to 

V o Z 

where A"+^/^ is associated with the constraints V^((7")"^p"+^/'* = 0. This gives 

where A"+-'^/^ is such that ^(g"^^) = z. The fluctuation-dissipation relation (|1.3p 
can be reformulated in this context as 

J, 2 4 
=^7=^ Id, 

and the scheme p.27p is recovered by taking the associated Lagrange multiplier 
equal to A"j'^ = A"+i/'' + 2A"+i/2. Finally, remarking that Gm = -^tG and 
computing exphcitly A"+i/2 and A^+^Z" in (pTfl) yields (|3:28l) - (|3:29)) - (|330l) . □ 

This point of view allows to construct a Metropolis correction to the Eulcr 
scheme p.27p . using the Generalized Hybrid Monte Carlo scheme f Algorithm 13. 5p 
with the time-step chosen according to p.26p . In this way, assuming that the posi- 
tion constraint (C<j) in p.l7p is everywhere well defined, we obtain a Markov chain 
('?")n>o discretizing the overdamped dynamics (j3.27p which exactly samples the in- 
variant distribution (|3.25p . Deriving such a Metropolis- Hastings correction to the 
Euler scheme p.27p without resorting to phase-space dynamics does not seem to 
be natural. 



Remark 3.7 (Discrete overdamped limit). Proposition 13 .61 can be seen as a discrete 
version of the zero-mass limit of the Langevin dynamics. It is also possible to obtain 
a discrete version of the overdamped limit (7 — > 00) of the Langevin dynamics by 
assuming that the parameters satisfy the relation 

At 

—7 = M cx Id, 

which is less restrictive than (|3.26l) . Equation (|3.27l) is then obtained with At 
replaced by 

.oo,N . 2At 

(^■31) ^^=2M=— • 

In this case, the effective discretization time-step As for the overdamped Langevin 
dynamics is thus different from the time-step At originally used for the discretiza- 
tion of the Langevin dynamics. This is reminiscent of the fact that the over- 
damped limit (at the continuous level) of the Langevin dynamics requires a change 
of timescale to obtain the overdamped Langevin dynamics. 
Note also that in the more general case 
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where M is not supposed to be proportional to the identity, the following numerical 
scheme is obtained: 



where As = At"^ /2. This is a discretization of the overdampcd dynamics 



(3.32) 



dq, = -~M-'^yV{qs) ds + -^A'r^/^dWs + A/"^V^(g^) d\s 



which is a generalization of p.23p to a scalar product on ^(z) induced by a general 
positive definite mass matrix M . □ 

4. Thermodynamic integration with constrained Langevin dynamics 

In this section, we focus on the computation of the gradient of the rigid free 
energy (|1.13p 

^^r^d {z) = ~-An( e-^^(^^^') ar-. [dq dp) , 

P Jt*T.(z) 

using a numerical discretization of the constrained Langevin process (CL). 

As explained in the introduction, we may indeed concentrate on the computation 
of the rigid free energy (|1.13p , since the standard free energy (jl.l2p can be computed 
from the latter one using (|1.14p . The relation (|1.14p can be proved with the co-area 
formula (|2.18l) . Indeed, the free energy defined in (|1.12[) can be rewritten as (where 
C denotes a constant which may vary from line to line): 

(4.1) F{z)^-j\n( e-^«('^^''%,)_,(dg)dp 

P JS{z)xK3JV 



iln / e-^^(^)(detGM(g))-^/^ait.)(dg) + C 



iln / e-^"^^'PHdetGM{q))-'^'cTT'Ei.){dqdp) + C 



f3 



T*E(z) 



hence 

(4.2) Fiz) - F^^^iz) - iln / (det GM)-^/'d/.^*s(.) + C, 

P JT*E(z) 

where surface measures arc defined in Section 12.31 Note that the rigid free energy 
-^rgd indeed depends explicitly on the mass matrix since 

(4.3) F//,(z) = -iln / e-™a^{,)(dg) + C. 

This section is organized as follows. First, we show how systems with molecular 
constraints and systems with constrained values of the reaction coordinate can 
be treated in a unified framework (Section 14. ip . We then relate the Lagrange 
multipliers arising in the constrained Langevin dynamics, and the gradient of the 
rigid free energy (the so-called mean force) in Section lT^ We consider the numerical 
computation of the mean force in Section 14.31 where we prove consistency results 
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for the corresponding approximation formulas. Finally, some numerical results on 
a model system illustrate the approach in Section 

4.1. Molecular constraints. We discuss here how to generalize all the computa- 
tions to systems with molecular constraints, generalizing thereby some results of [8|. 
Without loss of generality, we consider rigidly imposed molecular constraints, see 
Remark 14.11 below for a discussion of this assumption. This section can be consid- 
ered as independent of the remainder of the paper and may therefore be omitted 
in a first reading. 

In practice, many systems are subject to molecular constraints, such as fixed 
lengths for covalent bonds, or fixed angles between covalcnt bonds. The reader 
is referred to [42] for practical aspects related to the simulation of molecular con- 
straints. In the context of free energy computations, two types of constraints are 
therefore considered: first, the molecular constraints, 

Cmc(g) = {£.mcs{q), ■ ■ • ,Cmc,rn:(g)) = 0, 

for m < 3N, and second, the reaction coordinates denoted in this section by ^rc ■ 
M'^^ — > M™, with m + m < 3N. The submanifold of molecular constraints is 
denoted by 

Smc={geK3^ I ^„,(g)=0}, 

and the submanifold associated with the reaction coordinates by 

Src(^rc) = {geK3^|^rc(g)=Zrc}. 

It is assumed that the full Gram matrix: 

GZ''" V(6nc,^rc)^M-lV(f„,c,6c) S R("+™)x(-+™) 

is everywhere invertible on H Sicl^rc)- Likewise, we denote 
and 

GZ yCcM-'VUc e M"^™. 
Assuming rigid mechanical constraints on the molecular constraints (sec Re- 
mark ET] below), we arc led to considering the canonical distribution 

(4.4) /.T*s.„(d<zdp) = -Le-^«(9.P)5^_(^)_p^_(p^)(dgdp) 

= ^c-^"^'''P^aT'E„Jdqdp), 

to describe systems with molecular constraints at a fixed temperature. The mea- 
sure ctt'S^c denotes the phase space measure on T*I]nic, equal by (|2.2ip to the con- 
ditional measure ^£,^^{q),pi;^^{q,p){dq dp) associated with the constraints {^mc{q) = 
Oi?'4mc('?j-P) = 0), where pj„,„ is the cfFcctive momentum (|2.5p associated with ^,„c. 

Remark 4.1 (On the choice of the distribution (|4.4p '). The distribution //t'E„c 
in (|4.4p is obtained by constraining rigidly ^mc{q) to 0, and not 'softly' (in which case 
^imc{q),pi;„^^{p,q)(.dq dp) would be replaced by 5^^^(^qj(dq) dp, see also Remark 13. 3p . 
As explained in Section [31 the distribution (j4.4[) is the equilibrium distribution 
of a Langcvin process (thermostated Hamiltonian dynamics) with rigid position 
constraints ^mc((?) = 0. Choosing whether constraints should be soft or rigid is 
a modeling choice, and there is no clear consensus on this issue in the current 
literature. 
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Note however that it is possible to rewrite the remainder of this section by consid- 
ering the softly constrained potential rather than the rigidly constrained potential, 
up to an appropriate modification of (|4.6p below, with the help of some Fixman 
corrective potential. □ 

By associativity of the conditioning of measures, the distribution /iT»s„„ condi- 
tioned by a value of the reaction coordinates ^rc('?) = z^c is given, up to a normal- 
izing factor, by: 

Therefore, considering the marginal probability distribution of the reaction coordi- 
nates Crc('?) leads to the following definition of the free energy associated with ^j.^: 

P JT'S„cn(Src(Zrc)xR3«) 

The conditional distribution can be decomposed as follows, using the co-area for- 
mulas (|2.18p - (|2.20p and the definition of effective momentum (|2.5p : 

^«mc(9),P5„, (9,p),6c(9)-^,c dp) 

^ {det G^mq)f'4'iljdp) idetG'^riq)y'^''^¥Mn^.Jdq)- 
Integrating out the momentum in the linear space T*'E^c with scalar product 
(pi,P2)a/-i = PiM~-^p2, the free energy can be rewritten as: 

fi'^ j^^^^^^^^^^J UetG-;'-(g)j '^^.c(..)nE„Jrf9) + (^. 

As a consequence, the free energy i^™'^ can be computed from the generalized rigid 
free energy: 

(4.5) ^^,'^^'*'(^.c, 0) = In / e-'^^(«'^')ar.(s„(.,.)nE„.)(dpdg), 

P JT*(S„{2rc)nS„,o) 

using the following formula, similar to (jl.l4p : 

F^\z,,)-F^^'\z,,,Q) 

(4.6) 1 /• JdctG£)^ 

P JT*(E„(2„)nE„,)(detG"/' ) ' 

In the above, A*T*(s„(zrc)ns„<;) defined similarly to ()1.9p . The case of molecular 
constraints can therefore be treated within the general framework considered in this 
paper, the sampling of the canonical measure AtT*(Src(2rc)ns„c) ^^"^ '^^^ computation 
of the rigid free energy (|4.5p being the problems at hand. 

Remark 4.2 (The ovcrdamped Langevin case). For systems with molecular con- 
straints, the measure sampled by the overdamped Langevin dynamics p.23p with 
C " (6nc, Crc) is the marginal distribution in the position variables of /^T*(s„(z„)nE„c) 
in the special case M = Id, namely 

The rigid free energy which is thus naturally computed with such a dynamics is 

F^f'izrcO) = -iln / e"^^(^)<(,^^),^„Jd<j), 

P JS,c(2rc)nE„c 
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and the actual free energy is thus recovered by a formula similar to (|4.6p . namely 

4.2. The mean force and the Lagrange multipliers. In this section, the av- 
erage of the constraining force (|3.2p is related to the gradient of the rigid free 
energy (|1.13p (or mean force). We also give a similar result for the following gen- 
eralized rigid free energy: 

(4.7) Ffgd(C) = -^lii / e-^^('^^'as,(c)(dgdp). 

Proposition 4.3. The constraining force /^^^ : T*T,{z) — > R™ defined in p.2p as 

f^^Mp) = Gll{q)Vi{qfAr^VV{q) - G^/(g)Hess,(0(M- V, A^" V), 
yields on average the rigid free energy derivative: 

(4.8) V,i^/g{i(z) = / C(9,p)/^T*s(.)(rfgrfp). 

JT*S(2) 

Moreover, for general constraints (|2.6p and t/ie associated generalized free energy (j4.7p , 
i/ie formula can be extended as follows: The generalized constraining force is 

(4.9) (C) :=r-i{S,i/}, 
w/iere 

r(q,p) = {S,S}(g,p) = VS^(g,p) JVS((7,p) 
is defined in \2.9\ , and the rigid mean force is 



(4.10) VcF=d(C) ^yJ^ (^^) ^"''''^'^^-(0' 

where Z(^= I c~^^da^^(^Q, and F^^((^) is defined in (|4.7p . When {q,p) veri- 
/les P4(g,p) = W5(g,p) = 0, then g"{q,p) = and f"{q,p) = f^^M^P)- 

Proof. Formulas (|4.9p and (|4.10p arc obtained directly by replacing (/? by c~^^ in 
Lemma [121 (see the Appendix). The fact that {f^{q,p),g"{q,p)) ~ (/i^dl^'P)' 0) 
in the tangential case (namely when P(_{q,p) = v^{q,p) = 0) is a consequence 
of (ESI). □ 



The following lemma gives a momentum-averaged version of the constraining 
force (a similar formula exists in the overdampcd case, sec Equations (4.8)-(4.9) 
in [9], for example). 

Lemma 4.4. The rigid mean force (|4.8p can be rewritten as: 

(4.11) y^F,^A{z)^ ( 7'U^)l^T'^(Mdqdp), 

JT*E(z) 

where 

(4.12) 7fgd(g) = G^/(<z)Ve(g)^M-Vy(q)-r'G^/(<z)Hcss,(0 : {M-'PM{q)) ■ 
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Proof. Consider the Gaussian distribution (o?p) defined in p.l9p : 



i^T;^(z)idp) 2 ) '^T;^{z)idp), 

which is the marginal distribution in the momentum variable of the canonical dis- 
tribution fiT*^{z)(,d<l dp), conditioned by a given q G Proving Lemma 14.41 
amounts to showing that the average of the constraining force with respect to 



K^{^^^^{dp) yields f^^^: 



/rgd (<?)=/ f^d{(l,P) i^T' s (z) (dp) . 
Jt-t.{z) 

First, we compute the covariance matrix 

C := cov(4{5.'(^.)) 

of the Gaussian distribution K^,^i^^-^{dp) . Since ^iT'-T.(z)idp) is a centered Gaussian 
distribution, C satisfies, for all pi,p2 € M^^, 

pIM-'CM-'p2 := / (P^M-Vi) {P^M''P2) 4C.)(rfp) 

= / (p^M-iPM(g)pi) (p^A/-iPM(<z)p2)4rs(,)(dp). 
Jt;^{z) 

Denoting (pi,p2)j\f-i ~ pj M~^p2, this yields 

PlM CM P2= {P,PM{q)Pl)M-^{P,PM{q)p2)M-i- ,^,(3N-m)/2 ^T-T.[z){dp) 

jT^uz) i^TT/py 

= r\PM{q)puPM{q)p2)M-^, 

so that 

(4.13) C = l3-^PM{q)M. 
This gives 

/ Hess,(0(Af-V, M-^p) 4's(.)(c?P) = /3-'Hess,(0 : (Af"iPM(g)) • 

Averaging p.2p over momenta thus leads to the desired result. □ 

Free energy derivatives can also be obtained from the Lagrange multipliers of the 
Langevin constrained process (CL). This is very useful in practice since it avoids the 
computation of second order derivatives of the reaction coordinates which appear in 
the expressions of f^^^ and f^^^ (see the discussion at the beginning of Section l4.3.2p : 

Theorem 4.5. Consider the rigidly constrained Langevin process solution of (CL), 
with associated Lagrange multipliers At. Assume that VC; G^/ and G are bounded 
functions on S(z); and^p is strictly positive on T*T,{z) (in the sense of symmetric 
matrices). Then, the almost sure convergence (jl.lSp claimed in the introduction 
holds: 

(4.14) hm 1 f d\t^WzF^^{z) a.s 
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A similar result holds for the 'Hamiltonian part ' of the Lagrange multipliers, defined 
by: 

(4.15) 

dA^'" = dXt + Glj'V^iqtfM-' {-j{qt)M-'pt dt + a{qt)dWt) = f^^MuPt) dt. 
Indeed, the following almost sure convergence holds: 

(4.16) lim i rdA^- = V.F,4{i(z) a.s. 

The estimator based on (|4.16p has a smaUcr variance than the estimator based 
on (|1.15p (or (|4.14p above) since only the bounded variation part is retained, and 
the martingale part due to the Brownian increments and the dissipation term are 
subtracted out. Similar results on variance reduction where obtained in the over- 
damped case in [9]. 

Proof. Recall the expression p.ip of the Lagrange multipliers, which can be de- 
composed as the sum of the constraining force, a dissipation term and a martingale 
(fluctuation) term: 

(4.17) dXt = f^lMuPt) dt + G-lV£,{qtfM-^ {^{qt)M-^pt dt ~ cT{qt)dWt) . 



The result follows from three facts. First, the process is ergodic with respect to 
the equilibrium distribution ^J-T''T.{z){dqdp) and averaging /^^^ yields the rigid free 
energy derivative in view of Proposition 14.31 This already shows (|4.16p . 

Second, the Gaussian distribution of fJ-T*'S{z){dqdp) with respect to momentum 
variables is centered, which yields: 

/ Glliq)Vaqf Ar'jiq)M~' p HT'^^,){dqdp) = 0. 
Jt*i:{z) 

Third, the variance of the martingale term can be uniformly bounded as 

2 



E 



^ Gll{qt)Ve{qt)M-^a{qt)dWt 



< ||Tr(G^/ve^A/-Va^A/'iveG^;) 



This implies the almost sure convergence 

T-S.H 

see for example Theorem 1.3.15 in [17]. □ 



;im i / G^^{qt)VaqtfM-'a{qt)dWt 

^ + 00 1 Jq 



The fact that averaging the Lagrange multiplier in (|4.14p indeed yields the mean 
force may not be intuitive. This is actually very much related to the cost interpre- 
tation of the Lagrange multipliers in optimization, see |351 Remark 3.29]. 



4.3. Numerical discretization of the mean force. Estimates of the mean force 
based on either (|4.8|) . (|4.1ip or ()4.16p can be obtained. 
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4.3.1. Averaging local rigid mean forces. Free energy derivatives can be computed 
by averaging /j-gdC?) or f^^{q,p) with respect to the distribution HT*s{z)idqdp), 
for instance using the estimators: 

k=Q 

or 

The functions /rg(j(g) and f^^^{q,p) may thus be caUed "rigid local mean forces". 

Note that using the momentum-averaged local mean force /j.^^ instead of the origi- 
nal fi|£i reduces the variance since the fluctuations of the momentum variable have 
been averaged out analytically. Table [1] below confirms this analysis, although the 
variance reduction appears to be small in our specific numerical experiment. 

Assuming the convergence of the constrained splitting scheme (|3.16l) - (|3.17p - 
p.lSp in the probability distribution senscjl to the limiting Langcvin process (CL), 
the convergence of these estimators to '^zF^^g^{z) is ensured, when taking first the 
limit with K = TVa* such that A^AtAt T, and then T oo. 

4.3.2. Averaging the Lagrange multipliers. Free energy derivatives can also be com- 
puted using the Lagrange multipliers of a Langevin constrained process according 
to (|1.15p or (|4.16p . This technique avoids the possibly cumbersome computation of 
second order derivatives HesSg(^) of the reaction coordinate, which appear in the 

expressions of /^^^ or /i.gd- Besides, the Lagrange multipliers are needed anyway 
for the numerical integration of the dynamics. 

The computation can be performed as before with a longtime simulation of the 
splitting scheme (|3.16p - (|3.17p - (|3.18p discretizing the Langevin process with con- 
straints. The following approximation formula can for instance be used: 

(4.18) ^.F^,iz) - ^ Y.{X'+'/' + A^-+^/^) 

where (A'"'+^/^, A'^+'^Z^) are the Lagrange multipliers in the Hamiltonian part (|3.17p . 
The consistency of this estimator is given by the following proposition. 

Proposition 4.6 (Consistency). The approximation formula ()4.18p is consistent. 
More precisely, the Lagrange multipliers (A"+i/2, A"+^/**) in ((37T6l) - ([3Tf|) - ((37T8l) are 
both equivalent when At — > the constraining force defined in p.2|) ; 

A"+^/' = C(9",P"+'/')^ + 0(At^), 

^n+3/4 ^ (^n+l^p„+l/2)^ ^ 0{At^). 

Moreover, the following second order consistency holds for the .sum of the Lagrange 
multipliers: 

(4.19) A-+1/2 + Xn+m = ^(/M (^„^p„+l/2) ^ (,"+l,p"+l/2)) + OiAt% 

■^This convergence is also called weak convergence in probability theory. The proof of conver- 
gence in the present case may be carried out using classical results, see e.g. [19| . 
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together with the variant: 

(4.20) + = ^(/M ^ (^n+1 ^ ^.+3/4 )^ ^ o(^^3)_ 

The variant (|4.20p . which involves positions and momenta at the beginning and 
at the end of the Hamihonian steps only, is used in (|4.22p below to estimate the 
time discretization error in the thermodynamic integration method based on the 
estimator (|4.16p . 

Proof. For sufBciently small time-steps At, the implicit function theorem ensures 
that the two projection steps associated with the nonlinear constraints in (|3.16p - 
(|3.17p - p.l8p have a unique smooth solution. A Taylor expansion with respect to At 
of the position constraints gives 

z = = £.iq" + AtM" V+^^^) 

= + AtV^(g")^M- V+'/' + ^Hessq„ (0(A/" 

+ f^D3„(0(Af-V+i/2, Af-V"+'/') + 0{At^), 

where D^{(^){x, y, z) £ R™ denotes the order 3 differential of ^ computed at q and 
evaluated with the vectors x,y, z G R'^^. We denote 

Then, the fact that z = £,{q"+^) = ^(9") aiid the identity 

yield the following expansion of A"+^/^ in terms of ((7",p"+^/^): 

" 2 o 

By time symmetry, the same computation holds for A"+'^/'*, starting from 
and by formally replacing At by —At. This can be double checked by Taylor ex- 
panding with respect to At the position constraints, as done above for A""'"^/^. It 
thus holds: 

A^ A/2 
" 2 6 

The sum of the multipliers therefore reads 

^„+l/2 ^ _^„+3/4 _ ;M (^„^p„-Kl/2) At _ (^„+l^p„+l/2) At 

= ^ |^„n+l/2(^„+l) _ ^„+l/2(^„)^ ^ Q(^^3) ^ 0(At3), 

which gives (j4.19p . Now, using the previous calculations, we remark that: 
1 = P^+^'Z^ + ^ Vy(g"+i) - ^ve(g"+^)C(<?"+\p"+i/2) + 0(At2). 
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Thus, it holds 

This gives the claimed second order consistency of the sum of the Lagrange multi- 
phers dOni)- □ 



Let us discuss the convergence of the approximation (j4.18p . Assuming again that 
the constrained splitting scheme p.l6p - p.l7p - p.l8p converges in the probabihty 
distribution sense to the limiting Langevin process (CL), the following convergence 
in probabihty distribution occurs when At and N^tAt — > T: 



(4.21) lim Law 

At-J-O 



This shows the convergence of the estimate (|4.18p of the mean force when taking 
first the limit At ^ Q and then T ^ oo. 



4.3.3. Estimates relying on the Metropolized scheme. When the scheme (|3.16p - 
p.l7p - p.l8p is complemented with a Metropolis step (see Algorithm 13. 5p . it is 
possible to prove a result on the longtime limit of trajectorial averages (i.e. letting 
first the number of iterations go to infinity, and then taking the limit At — > 0), 
upon assuming the irreducibility of the numerical scheme. 

Indeed, let us consider the Markov chain {q^ tP^) generated by the GHMC scheme 
in Algorithm 13.51 and assume (i) the irreducibility of the Markov chain, and (ii) 
that appropriate rejections outside the set Z^a* = S(z) x {^p^ M~^p < i?At} sltb 
made in the steps (l)-(2)-(4) of the algorithm. In particular, the projection steps 
associated with the nonlinear constraints in Step (2) of Algorithm 13.51 are well 
defined. 

Then, by ergodicity, an average of the analytic expression of the local rigid 
mean force /^g^ given in ()4.12p yields an estimate of the free energy without time 
discretization error: 



1™ ^EC('z')=V.F//,(z) a.s. 
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If f^^^ is used instead of /j-gdi then the mean force is computed with some expo- 
nentially small error: almost surely, 



M 



fi"d{Q,P)^J-T'S(z){dqdp) 



K^+oo K 



_ tJ-T'T,(z)(d<ldp) 



frgdil^P)fJ'T*j:{z)idqdp) 



tJ-T'i:{z)idqdp) 

T*T.(z) 
M t ^\ , m f,.-aAt 



e 



= V.F/^dW + 0(c 

for some a > 0. The error arising from replacing Dai with T*'E{z) is indeed 
exponentially small in view of p.22p (namely R^t > AAt~'^) and using the fact 
that the marginal distribution in the p-variable is Gaussian. 

Likewise, for estimates based on Lagrange multipliers, the following longtime 
averaging holds: almost surely, 
(4.22) 



frediliP)^^T'-nz)idq dp) 



fe=o /_ fJ.T'S{z)[dqdp) 

JDAt 

where we have used the estimate (|4.20|) on the Lagrange multipliers. The limit 
At — > is obtained by a dominated convergence argument: 

lim hm (A'=+i/2 + A'=+3/4)=V,i^*^(z) a.s. 

n=0 

Note that, due to the Metropolis correction in Algorithm l3.51 the time discretization 
error in the sampling of the invariant measure is removed. The only remaining time 
discretization errors come from (i) the approximation of the local mean force by 
the Lagrange multipliers (this is a second order error), and (ii) the integration 
domain being Da* instead of r*S(z) (as discussed above, this is an exponentially 
small error in At). In conclusion, the left-hand side of (|4.22p is an approximation 
of ^zF^fdi^) up to a O(At^) error term. 

4.3.4. Overdamped limit. Finally, let us emphasize that free energy derivatives can 
be computed with the estimator (|4.18p within the overdamped Langevin framework, 
using the scheme p.27p and the expressions ()3.28p - p.29p - p.30p of Proposition l3.6l 
Let us recall that the latter are equivalent to the scheme p.l6p - p.l7p - p.l8p with 
fluctuation-dissipation matrices satisfying ^7 = M = -^Id. This leads to the 
original free energy estimator (recall that, for the overdamped dynamics, R'^^ is 
equipped with the scalar product associated with the identity matrix): 

(4.23) ^zFli,{z) ^ Y^iX'^+'Z' + A'=+3/4)^ 

k=0 
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which can be seen as a variant of the variance reduced estimator proposed directly 
for the overdampcd scheme ()3.27|) in [9]: 



k=0 \ V / k=0 

The rigorous justification of the consistency of (|4.23p in the hmit At ^ follows 
from the results of [9]. See also Section r5.5.2l below for similar results. 

4.4. Numerical illustration. We consider a system composed of N particles in a 
2-dimensional periodic box of side length L, interacting through the purely repulsive 
WCA pair potential, which is a truncated Lennard- Jones potential: 

12 



VwcA(r) 



4e 



he if J' < ^o, 
if r > To, 



where r denotes the distance between two particles, s and a are two positive pa- 
rameters and tq = 2^/^(7. Among these particles, two (numbered 1 and 2 in the 
following) are designated to form a dimer while the others are solvent particles. 
Instead of the above WCA potential, the interaction potential between the two 
particles of the dimer is a double-well potential 



(4.24) Vsir) = h 



^ _ (r - rp - 



where h and w are two positive parameters. The total energy of the system is 
therefore, for q g (LT^^ with d = 2, 

V{q) ^Vs{\qi-q2\)+ E ^WCA(|g^ - 9^ |) + E E ^WCAd?* ~ 9^1). 

3<i<j<N i=l,2 3<j<N 

See [131 m] for instance for other computational studies using this model. 

The potential Vg exhibits two energy minima, one corresponding to the compact 
state where the length of the dimer is r = tq , and one corresponding to the stretched 
state where this length is r = tq + 2w. The energy barrier separating both states 
is h. The reaction coordinate used to describe the transition from the compact to 
the stretched state is the normalized bond length of the dimer molecule: 

(4.25) a,) - ^1^1^, 

2w 

where qi and q2 are the positions of the two particles forming the dimer. The 
compact state (resp. the stretched state) corresponds to the value z = (resp. 
z = 1) of the reaction coordinate. 

The inverse temperature is set to /? = 1, with = 100 particles {N — 2 solvent 
particles and the dimer) with solvent density p = {l — 2/N)a~'^ = 0.436, since there 
are N — 2 solvent particles in a square box of side length L — a\/N with a — 1.5. 
The parameters describing the WCA interactions are set to cr = 1 and e = 1, and 
the additional parameters for the dimer are w = 2 and h = 2. 

For this system, M ~ Id and |V^| is constant, so that the rigid free energy 
F^^{z) is equal to the free energy F{z). 

The mean force is estimated at the values Zi = Zmin + iAz, with Zmin = —0.2, 
Zmax ~ 1-2 and Az — 0.014, by ergodic averages obtained with the projected 
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Reaction coordinate 



Figure 1. Top: Estimated mean force. Bottom: Corresponding 
free energy profile. 



dynamics with Metropolis correction (Algorithm [XU where in the simple case con- 
sidered here, the fluctuation-dissipation part can be integrated exactly). For each 
value of z, we integrate the dynamics on a time T = 2 x 10^ with a step size 
At = 0.02, using a scalar friction coefficient 7 = f . 

The resulting mean force profile is presented in Figure [U together with the 
associated free energy profile. Figure [5] compares the analytical constraining force 
/rgd(9"'P") ''^^'^ Lagrange multipliers, see Proposition 14.61 In Figure[2l the x- 
axis represents the blocks of 10^ simulation steps, concatenated for the 101 different 
values oizi. It can be seen that the difference between /rgd('Z"iP") and the Lagrange 
multipliers is small in any cases, though somewhat larger for the lowest values of ^. 
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-200^ ^ ^ ^ 1 ^ ^ ^ 1 ^ ^ ^ ^ ^ ^ ^ 

0.0 2.5e+06 5.0e+06 7.5e+06 1.0e+07 

Iteration index 





Figure 2. Top: The constraining force /rgd('7",p") (pale line), 
and the difference between the constraining force and its estimate 
from the Lagrange multipliers (dark line). Middle: Zoom on the 
difference between the constraining force and the Lagrange multi- 
pliers. Note the difference of scales for the y-axis. Bottom: The 
schedule ^ is piecewise constant. In all figures, the x-axis repre- 
sents the blocks of 10^ simulation steps, concatenated for the 101 
different values of z^. 
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It can be checked numerically that the differences |A"+^/^ - /i^d(9",P"^^^^)4^l 
and |A"+3/4_ j-M^(g«+i^pn+i/2) At| jj^^gg^ of order At^, and that the difference 

y.+l/2 ^ ^„+3/4 _ ;M (^„^p„+l/2) At _ ^ p„+l/2) 

is indeed of order At^ (by computing the average of these elementary differences 
for various step sizes) . The Lagrange multipliers are in any case very good approx- 
imations to the constraining force /j*^. 

Let us finally discuss the efficiency of the different estimators of the mean force, 
in terms of their variances. They can be written as the empirical average of the 
following random sequences: 

AJ- \"+l/2 I \n+3/4 



/4d(9",P"),/r.d(9"),' 



At 



where A"+i/2^ A"+3/4 given by the numerical scheme (|3l6| - (f3T7| - ([XT8| . 

The correlations in time (between the iterates) are very similar for the three meth- 
ods, and we therefore simply compute the variance over all the samples. Table [1] 
compares the so-obtained standard errors over 10^ time-steps with At = 0.02 (sim- 
ulation time T = 2000 for each value of the reaction coordinate). The results show 
that the different estimators are more or less equivalent. This is related to the fact 
that the essential source of variance comes from the sampling of the positions, and 
not the sampling of the velocities. Note however that, for the smallest value of the 
reaction coordinate, the estimator based on the averaged local mean force /rgd(9") 
appears to be better in terms of variance. 



A»+l/2 I ;\n+3/4 — ; 



-0.2 


22.1 


21.9 


14.7 


0.0 


16.0 


15.5 


15.4 


0.2 


23.1 


22.5 


22.9 


0.4 


21.1 


20.4 


21.0 


0.6 


21.4 


20.7 


21.3 


0.8 


21.6 


20.9 


21.5 


I.O 


21.4 


20.6 


21.4 


1.2 


21.0 


20.3 


20.9 



Table 1. Standard error (square- root of the variance) of three 
mean force estimators, with correlations in time neglected, for dif- 
ferent values z of the reaction coordinate. 



5. Hamiltonian and Langevin nonequilibrium dynamics 

This section presents nonequilibrium Hamiltonian and Langevin dynamics with 
time-evolving constraints. We thus consider (qt^Pt) solution to the dynamics (SCL), 
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which we recall for convenience: 
M-^pt dt, 

-VV{qt) dt - -fp{qt)M-^pt dt + ap{qt) dWt + Ve(gt) dXt, 

zit), {C,{t)) 

Wc prove in particular the fluctuation identity (|f .f 9p . see (|5.24p below. Recall 
(see (|l.f 6p ) that, for simplicity, we assume in this section that the fluctuation- 
dissipation matrices are assumed to be of the form {crp,jp) = {Pm f^-, Pm 1 Pm) 
with 7, o" e R-^^^^^. At variance with the previous sections, we do not assume 
that 7p is strictly positive. Actually, jp = corresponds to an interesting case: 
the deterministic Hamiltonian dynamics. 

To our knowledge, the standard work fluctuations derived so far (except for 
our previous work [34j ) apply only to the case of time-dependent Hamiltonians. 
It is possible to consider transitions in the values of some reaction coordinate in 
this framework upon resorting to steered molecular dynamics techniques. In this 
case, a penalty term s~'^{^{q) — z{t))'^ (with e small) is used in the energy of 
the system to "softly" constrain the system to remain close to the submanifold 
S(z(t)) ~ {q E M.'^^ I ?(<?) = ^{t)} at time t. However, it is observed in practice 
that the statistical fluctuations increase with smaller e (see [H]). We propose 
instead to replace the stiff constraining potential £^^(^(q) — z)^ by a projection onto 
the submanifold S(z). This is reminiscent of the replacement of stiff constrained 
Langevin dynamics by rigidly constrained ones, see Remark 13.31 

This section is organized as follows. We first define the generalized free en- 
ergy which is naturally computed with (SCL), and relate it to the standard free 
energy (|1.12p in Section 15.11 Then, we give some precisions on the nonequilib- 
rium dynamics (SCL) in Section [5.21 Next, we prove an appropriate version of the 
Jarzynski-Crooks fluctuation equality in Section 15.31 A numerical discretization of 
the noncquilibrium dynamics is proposed in Section [5. 4| together with various ap- 
proximations of the work. In particular, we propose a numerical strategy to obtain 
a Jarzynski-Crooks identity without time discretization error (see Section l5.4.5p . 
We then consider the overdamped limit when the mass matrix M goes to (see 
Section 15. 5p . Finally, in Section 15.61 we present some numerical results for the 
model system already considered in Section |4^ 

5.1. Generalized free energy. For {qt,Pt) solution to the Langevin dynamics (SCL), 
the reaction coordinate evolution ^{qt) — z(t) implies that v^{qt,pt) = z{t), so that, 
at each time t > 0, the system {qt,Pt) belongs to the state space ^^,v^ {z{t), z{t)). As 
a consequence, the free energy difference computed in this section by the Jarzynski 
relation without correction (see (j5.2ip below) is in fact the generalized rigid free 
energy defined in (|4.7p . in the special case S = (^, v^)'^: 

^fgd(C) - -^In / e-^^('^^'Vs,(c)(dgdp). 

The latter free energy is associated to the normalization constant Zz(t).z{t) of the 
distribution fi^^ (z(t),i{t)) defined by (|2.15p . The generalized rigid free energy (|4.7p 
can be explicitly related to the usual free energy as follows. First, remark that, for 



(SCL) 
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a fixed q, 

X^^^^ ^^^^^ exp 

= exp (^-^f^G-,l{q)v^ j^^^^^^ exp ^-^fAr^p"^ ^S.)!^^) 

= (27r/3-i)'^ exp G,-/(<z)«.) . 

In the above, the change of variable p ^ p — V^{q)Gjj {q)vz has been used, in the 
space 



= {pe K^^ I VaqfM-' (p - Ve(g)G^/(<z)«.) - o}. 



Note that ^v^Gj^{q)vz can be interpreted as the kinetic energy of the reaction co- 
ordinate ^. Using the decomposition of measures ()2.24p and the above calculations, 
an alternative expression of the generalized free energy is: 

(5.1) F^^lHz,Vz) = -iln l^^^^exp (^~pV{q) - ^i;jG^/ (g)i'.) 4{,)(dg) + C, 

where, as usual, C denotes a generic constant (independent of z) whose value may 
vary from line to line. As a consequence, the standard free energy (|1.12l) is easily 
recovered from the generalized free energy, using relations similar to (|1.14p . In- 
deed, using (|5.ip . and with computations similar to the ones leading to (|4.2p . the 
difference of the two free energies writes: 
(5.2) 

= -iln / {dctGM{q))~^^^ exp ( ^vjG];^{q)vA fi^^ ^fz.v-)idqdp) + C. 

In practical nonequilibrium computations, the profile 1 1— > F{z{t)) can then be com- 
puted by adding a corrector to the work value in the Jarzynski estimator computing 
F^^^ {z{t) , z(t)) . This yields the identity (|5.24p mentioned in the introduction and 
proved below (see the discussion after Theorem 15. 3p . 



5.2. Dynamics and generators. The explicit expression of the Lagrange multi- 
pliers in (SCL) is obtained by a computation similar to (j3.ip for the case without 
switching, by differentiating twice the constraints over time: 

In view of the special structure of (cp, 7p), this leads to 
d\t = f^uPt) dt + GlliqtYm dt 

+ G^^j'V^qtfM-' {MQt)M-'pt dt - ap{qt) dWt) 
(5.3) = /4dfe,Pt) dt + Glj\qt)z{t) dt. 

The expression (|5.3p does not depend on the fluctuation-dissipation tensors (crp, 7p). 
This leads to simplified computations and motivates the special form of the latter 
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matrices. The momentum evolution (SCL) thus simphfies as 

dpt = -Vl/(9t) dt + V^{qt)f^lMt.Pt) dt + Vaqt)Gjliqt)m dt 

-jp{qt)M-^Ptdt + <jp{qt)dWt. 

Let us denote by Cf the generator of the forward dynamics t {qt,pt) defined 
in (SCL). The latter has a backward switching version, 

t'^{q^,,p^,), 

obtained by using a time reversed switching t' i— > z{T — t'), and by reversing the 
momentum first in the initial condition, and then reversing them back after the 
time evolution (see [S] for more general backward dynamics). More precisely, the 
backward dynamics can be defined through its generator 

(5.5) 4, =7^£f^„,,7^, 

where C^x~t' ^'^^ generator of the forward process at time t = T — t' , and TZ : 4> t-^ 
(j) o S is the momentum flip operator with S{q,p) = {q, ~p). Thus t' i~> [q^,, —p'l,) 
is solution of the forward evolution equation (SCL) with a switching schedule t' 1— >■ 
z(T — t'). Therefore, the time evolution of the backward dynamics is given by 

{dq^^, = -M-^p\,dt', 
dp\, - VF(qb) dt' - 7p(gb)A/-lpt,^ + ap{q'l,)dW^, + V £,{q^,,) dX\, , 
aq^,)^z{T^t'). 

In the following proposition, the expressions of Cl and C^, are explicitly written. 

Proposition 5.1. Consider (^(t) — {z{t), z{t)). Then, the generator of the forward 
process (SCL) at time t € [0,T] reads: 

(5.7) 4 = { • , + /i^n + 2} r-ic(t), 

and the generator of the backward process (|5.6p at time t' £ [0,7] reads: 

(5.8) Cl ^ -{■ ,H]^ + £1"" - {-,2} T-^C{T - t'), 
where 

£|hm^ I^,ifdivp(e-'5^7PV, 
is the fluctuation-dissipation operator defined in 



Proof. First, let us consider the terms in (SCL) arising from the Hamiltonian evolu- 
tion and from the switching (i.e. without fluctuation-dissipation, which amounts to 
setting 7p = and crp = in (SCL)). Since during this dynamics v^{qt,pt) — z{t), 
([3:61) yields: 

{E,H} {quPt) = (Hess,,(0(M"Vt,A^-V) - (VC^Af-^Vy) (g,)) ' 
so that, using (|3.7p . 

(5.9) r-(,„p,)({s,F}(,„p,)-C(t)) = (^Ml^O^W + Cfe,?*)^ . 



LANGEVIN DYNAMICS WITH CONSTRAINTS 



41 



With p.9|) . we then obtain: 
(5.10) 

{^,s}r-i (c{t)-{E,H}) {qt,pt) = {Gil{qt)m + f!^AquPt)fyaqtfyp^{<it,Pt). 

Now, the Hamihonian part of the switched dynamics (SCL) (see also (|5.4p ) can be 
recognized in (|5.10p . so that the generator when {-fpjap) = (0, 0) reads: for any 
smooth test function ip, 

Cliip) = (Va^/d + VeGj/z(t))^ Vp^ - {VVf Vpip+p^M-'V.cp 
^{ip,E}T-'m-{E,H}) + {^,H} 

(5.11) ^{v,H}^ + {^,E}r-^at). 

The full expression of the generator Cl is then obtained by adding the terms arising 
from the fluctuation-dissipation. These terms are directly obtained from the terms 
involving jp and ap in (|5.4p . as in the proof of Proposition 13. II 

The generator of the backward switching process given by (|5.6p can be obtained 
from similar computations. First, the thermostat parts in ()5.6p and in (SCL) are 
the same. Consider now the Hamiltonian part (obtained by taking (7p,(Tp) = 
(0,0)) in the dynamics (j5.6p . By definition of the backward dynamics, and the 
expression (|5.1ip of the forward dynamics, the Hamiltonian part reads 

£'i,{^){q,p)=TZ£tr-t' {nv)){q,p) 

= {vaq)Oiq,p) + ve(<z)G^/((?)z(r - t')f i-Vpip) 

- VV{qf {~Vpv)-p^M-^V,^, 

so that 

C\,^ = -Clr_^,v = - {^, H}^ - {^, E} r-iC(T - t'). 
This gives (HH). □ 

5.3. Jarzynski-Crooks identity. Before stating the main result of this section 
(Theorem 15 .31 below*) . we need to introduce a notion of work. This quantity is most 
conveniently defined for deterministic dynamics, but the corresponding definition 
is also valid for stochastic dynamics. 

We define the work {Wt)t>o associated with the constraining force VS^{qt)d\t 
in (SCL) as the physical displacement multiplied by the force: 

dqt\^ ^ (^^f^^ \ f dqt\^ _-T( 



dWt :- o [vaqt] dXt) = V^qt) o dXt = [t) o dXt 

(5.12) ^2F{t)dXt. 

By convention, Wq = 0. In the above computations, we used successively the fact 
that t n- £,{qt), and then t ^ z{t) are differentiablc processes, so that Stratonovitch 
and Ito integrations arc equivalent. Let us introduce the deterministic version of 
the noncquilibrium process (SCL) (i.e. (7p,crp) = (0,0)): 

dqt = M~^pt dt, 

(5.13) { dpt = -yV{qt) dt + Ve(gt) dXt, 
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and denote by ^t,t+h ■ Sf,!,^ (^(t), — > T,^^vi:{z{t + h),z{t + h)) the associated 
flow between time t € [0, T] and t + h E [0,T]. The work can now be written out 
more expUcitly using the following lemma: 

Lemma 5.2. The infinitesimal variation of the work (|5.12p reads: 

dWt = w{t,qt,pt)dt, 
where for all t G [0,r] and all {q,p) G S^.u^ (z(t), i(t)), 

(5.14) wit, q,p) = atfT-' {S, H} {q,p) 

(5.15) = z{tf {Gll{q)z{t) + f^^M.P)) 

(5.16) =(^±Ho^,,,,)\^Jq,p). 

The total exchanged work is then a time integral associated with the path t n> (qt,pt), 
and is denoted by: 

Wo,T ({gt,Pt}o<t<T) =Wt-Wo=^ w{t,qt,pt)dt. 

Note that the expression (j5.16p can be interpreted as the energy variation of the 
system during the switching when the stochastic thermostat is turned off. 

Proof. The expression of the Lagrange multipliers in (|5.3p yields (|5.15p : 

z{tfdXt - z{tf {Gll{qt)z{t) + f^lMuPt)) dt. 
Moreover, (|5.9p gives: 

z{tf {G-,}{qt)z{t) + f^^MuPt)) dt = atfr-\qt,pt) [{E,H} {qt,pt) - C(t)) 

= C(0^r-l{S,i?} {qupt), 

where in the last line we have used C(0"'"r~^C(i) = 0. This gives (|5.14p . To 
prove (j5.16p . we compute the variations of the energy H{qt,pt) for {qt,pt) solution 
of (|5.13p with initial condition {q,p): 

dH{qt,pt)=pjM-^ dpt+pjM-^VV{qt)dt 
= z{tfd~Xt 

= zitf {G~,l{qt)m + C(gt,Pt)) dt. 

The last equality is obtained using the computation of the Lagrange multipliers 
in dO]). This yields ([5T6)) . □ 

We are now in position to state the main result of this section. 

Theorem 5.3 (Jarzynski-Crooks fluctuation identity). Consider the normalization 
Zz[t),z(t) for the canonical distribution ^.^i^ ^^(z(t),z(t)) defined in (j2.15p . Denote 
{'it,Pt\o<t<T the solution of the forward Langevin dynamics (SCL) with initial 
conditions distributed according to 

(5-17) {qo,Po) ~ lJ-T^^^^^(z(o),z{Q)){dqdp), 

and by {q]^i ,p^,}o<t'<T the solution of the backward Langevin process (j5.6p with 
initial conditions distributed according to 

(5-18) (90,^0) ^ l^Y.^, (z(T),z(T)){dqdp). 
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Then, the following J arzynski- Crooks identity holds on [0, T]: for any bounded path 
functional (p[o.T]: 



(5.19) 



7 E 



'^[o,T] {{quPt}o<t<T) e-^^°'-({'-^'>-['-,)) 



where ( • Y denotes the composition with the operation of time reversal of paths: 

(5.20) <y9[oy](^{gJi,p^,}o<t'<T) = '^[o,T]({'?T-t>-PT-t}o<t<T)■ 
Note that the theorem still holds in the Hamiltonian case, i.e. when {-^p^ap) = 

(0,0). The choice (p[o,T] ~ 1 in (|5.19p leads to the following work fluctuation 
identity: 

(5.21) F^^lHz(T),z{T)) - F^^lHz{Q),m) = -^In [e (c-^^«'-««-^">-[o..,)) " . 

Besides, upon choosing a path functional cxp{9pWn.T), it is possible to obtain a 
family of free energy estimators, parameterized by 9 and where both forward and 
backward paths are weighted by the exponential of some work. Moreover, the 
standard Crooks equality on ratios of probability density functions of work values 
is also a consequence of (j5.19p . see Section 4.2.2 in [35] . 

Note also that the choice 'Pio.T]{qiP) = 4'{'1t,Pt) leads to the following represen- 
tation of the canonical distribution fi-^^ ^ {z{T).i{T))' 
(5.22) 

E(0(gT,Pr)e"'^^°-^(^''''^'^*^[«-^i)^ 



E5,„,(^(T),i(T)) 



(l>{q,p) lJ-T^^,^Az(T),i{T)){dqdp). 



The usual free energy profile z i— > F(z) can therefore be computed using the rela- 
tions (|1.18p - (|1.19p presented in the introduction. Indeed, Equation (|1.19p (see (j5.24p 
below) can be proved by combining (15. 2p and (|5.21l) - (|5.22p as follows: 

F{z{T)) - Fizm ^ {f{z{T)) Ff^^^{z{T)rz{T))) - (f(z(0)) - ^^.^^^ (z(0), i(0))) 

-iln E fe~''^°'^'^^'''^'^'eio-^i''' 



-ilnE((dctGM(gT))-'/'e^^(^)"G,7(g^)i(T) e-/3Wo,T({g*,P.},„o,.,) j 
+ ilnE((dctGM(go))-'/'c^^(o)"«^/(9«)^(")) 



(5.23) 



In 



E (e-/3C(o.9o)) 
where the corrector C{t,q) is defined in (|1.18l) : 

C(t,g) = ^ln(detGM(g)) - hitfG-J{q)m- 
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This leads to the fohowing relation: 

a<t<T 

)+C{T,qT)] 

(5.24) F(.(T))-n.(0)) = --ln(^^ ^ (0-^^(0^.0)) 

Estimators of the free energy based on (|1.19p can then be constructed, see Chapter 4 
in [35] for a review. 

Before turning to the proof of Theoreni l5.31 we first give the general lemma which 
enables to deduce the Jarzynski-Crooks fluctuation identity from a nonequilibrium 
detailed balance condition (similar to the one presented in [5] for switchings arising 
from a time-dependence in the Hamiltonian) . 

Lemma 5.4. Let {qt,Pt)o<t<T (resp. [q^ ,p\)i)<t<T) be a Markov process with 
infinitesimal generator C\ ( resp. C\ ) and initial conditions distributed according 
to (|5.17p (resp. (|5.18p ). Let us assume that the following nonequilibrium detailed 
balance condition is satisfied: for any two smooth test functions ip\, ip2, 

\ (¥'i4(<P2) - v52£T-t(<Pi)) C^" da^^ ,(z(t),i{t)) 

„.(z(t).z(t)) ^ 

(5.25) = / /?w(i,-)vi<^2e-^^das,_ (.(t),i(t)) 

1 / 'PiV2G~^" da^ i,(t)Mt)) ■ 

Then the Jarzynski- Crooks fluctuation identity (j5.19p holds. 

Proof. We use in this proof the short-hand notation Zt, tt^ and St for the parti- 
tion function Z^(^t).z{t)^ (unnormalized) distribution Zz(t),i(t)MS£,„^ (z(t),i(t)) = 
c^Y.^ „^(z(t),z(t)) and the submanifold S^^u^ (^(t), i(i)) respectively. 
Let us introduce the following weighted transition operators: for any bounded 
test function 

(5.26) Pf^(v^)(g,p) =E(^(gT,PT)e-^^--({«-P=>-i--i) | {quPt) = (9,p)), 

(5.27) PIt{v){<1,p) = Av{<i\,Pt) {<1v,pI) = {<1,P)\ 




where {qt,Pt)o<t<T (resp. {<lt ,p]^)o<t<T) is a Markov process with infinitesimal 
generator Cl (resp. £^), and Wt,T = Wo,t — Wo,t- 

We assume that these operators are well defined and smooth with respect to 
time for sufhcicntly smooth test functions defined in an open neighborhood of 
Y.^.„^{z{t),z(t)) and Sj,^,^ (z(t'), i(t')) respectively (for any t,t' e [0,T]). 

The transition operators satisfy the following backward Kolmogorov evolution 
equations: 

'dtPlr = -CIpIt + Pw{t, ■) PIt, (dt'PlT = -^?'^t',T, 
P^j, = U, \p^j.^M. 
Consider now two test functions (po and (px- The balance condition (j5.25[) implies 
d 



dt 



St 



P[t{^t) Pr^tM^o) dnt^ - 0. 
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Integrating this equality on [0,r] yields 

(5.28) / PQ j^{ipT)vod'KQ^ j (pTPo,Ti'^o)dnT, 

^ So <J St 

which is the Crooks identity (|5.19p for path functionals of the form 

no,T]{q,P) = 'Po{qo,Po)'PT{qT,PT)- 

Indeed, 

P^T{VT)^odTTo = ZoELT(gT,PT)¥'o(go,Po)e-'^^'''^"''»'''=^=^i«.^i^ 

''50 ' 

while 

/ PI^tWq) d-KT = '^t('7o'Po) Vo('7t)-Pt) 

J St 

Then, using the Markov property of the forward and backward processes. Crooks 
identity (j5.19p can be extended to finite-dimensional path functionals of the form: 

(5.29) <P[o.T] = fo{qQ,Po)---'Pk{qtk,PtJ---(PK{qT,PT) 

with = to < ti ■ ■ ■ < Ik = T hy repeatedly using a variant of (j5.28p on time 
subintervals \tk,tk+i\ (see the proof of Theorem 4.10 in [35] for further precisions). 
This allows to conclude since finite dimensional time marginal laws characterize the 
distribution on continuous paths, see for instance |19j . □ 

We are now in position to write the 

Proof of Theorem 1 5. 3[ By Lemma 15.41 it is sufficient to prove the nonequilibrium 
detailed balance (|5.25p for the Markov processes {qt,Pt)o<t<T and (qt , Pt)o<t<T , 
solutions to (SCL) and (j5.6|) respectively, with generators Cl and C^, defined by (|5.7p 
and (HiHl). 

First, using Lemma |6.2[ we compute the variation of the unnormalized canonical 
equilibrium distribution with constraints with respect to the switching: 



LpY<~Pl e ^^-^ rfo-Ej,„, {z{t),i{t)-) 



(5.30) =/ aO'^r-i {S, ^1(^2 c-^^} das, 
On the other hand, (|5.14p and Proposition 15. II give 

(5.31) = Wi^2,H}^ + e^" {^1(^26-^^, S} r-iC(t) 

+ (fi^c/^^divp (e-'3^7pVpV'2) - V32^c'3^divp {c->^"jp\/p^i) . 

Now, (|5.25p can be verified in two steps. First, the last two terms in (|5.3ip (the 
"thermostat" terms) cancel out after integration with respect to e~^^ da-^^ ^ {z{t),i{t)) 
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thanks to the detailed balance condition p.l2p . Then, an integration of ()5.31|) with 
respect to e"*^^ da^^ ^ (z(t),z(t)) gives, in view of (|5.30p and (|2.25p . 



which is indeed ()5.25|) . Note that the time- regularity on the evolution semi- groups ()5.26p - 
(|5.27p rcc^uircd to make these computations rigorous is proved in the overdamped 
case in the proof of Theorem A. 5 in f34| . A similar proof can be carried out for 



5.4. Numerical schemes. In this section, a numerical scheme for the nonequi- 
librium dynamics (SCL) and the associated free energy estimator are presented. 
As for Langevin processes with constraints (Section 13. 2|) . a splitting between the 
Hamiltonian part and the thermostat part of the dynamics (SCL) leads to a sim- 
ple and natural scheme (see (|5.32|) - ()5.33|) - ()5.34|) below). Note that a consistent 
numerical scheme in the case of Hamiltonian dynamics can be obtained by consid- 
ering only (|5.33p (this corresponds to 7 = cr = 0). Besides, we propose a discrete 
Jarzynski-Crooks identity without time discretization error, see Section [5.4.51 

The reaction coordinate path is first discretized as {^(0), . . . , z{tiq^)} where Nt 
is the number of time-steps. For simplicity, equal time increments are used, so that 
At = and t„ = TiAf. The deterministic Hamiltonian part in the equations 
of motion (SCL) with switched position constraints ^(g) = z[t) can be integrated 
by a velocity- Verlet algorithm with constraints similar to (|3.17p . The fluctuation- 
dissipation term in (SCL) can be integrated similarly to the constrained case with- 
out switching p.l6p - p.l8p . using an Ornstein-Uhlenbeck process on the momentum 
variable approximated by a midpoint Euler scheme. In conclusion, the splitting 
scheme for the Langevin dynamics with time-evolving constraints reads as follows: 
Take initial conditions (q'^,p'^) distributed according to f , z{ti)~z(to)\ and 




constrained Langevin equations. 



□ 
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iterate on < n < Nt — 1: 
(5.32) 



(5.33) 



pn+1/2 ^ _ ^Vy(g") + VC(g" ) A^+^Z^ , 

C(g"+i)=z(t„+i), 



h3/4 



At 



p«+i/2 _ _vi/(q"+i) + ve(g"+')A"+3/4, 



(5.34) 



^n+l ^ pn+3/4 _ 1 (^"+3/4 ^ ^n+l) 



iCp) 



where (5") and (C/"+^/^) are sequences of i.i.d. Gaussian random variables of 
mean and covariance matrix Idajv- Note that the momenta obtained from (j5.32p - 
satisfy 

(5.35) 

Ve(g")^A/~ V+^/" = Ve(9")^A'/-V" = V^q^fM-^p^-'/^ = '^^^"+^^~ ^^^"^ , 

so that constraints on momenta are automaticahy enforced, and no Lagrange mul- 
tiplier is needed in (|5.32p and (|5.34p . 

Wc comment in the subsequent sections on the different parts of the scheme. 



5.4.1. Comments on the Hamiltonian scheme (j5.33p . The Lagrange multipliers A""''^/^ 
are associated with the position constraints (Cq), and the Lagrange multipliers 
^n+3/4 associated with the velocity constraints (Cp). In (Cp), the velocity of 
the switching at time t„+i is discrctized as: 



2 (in 



+ 1) 



z{tn+2) - z{tn+i) 
At 



The latter choice is motivated by the following observation: The position after one 
step of an unconstrained motion, given by 

A+2 

already satisfies (Cq) up to error terms of order two with respect to At. Indeed, 
using (|5.35p : 

^(g"+i) ^ ^(g") + AtV^(g")'^A/-V"+^/'* + 0(At2) ^ z(<„+i) + 0(At2). 

This property is useful to ensure a fast convergence of the numerical algorithm 
solving the nonlinear constraints {Cq). 
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The numerical flow associated with ()5.33|) is denoted in the sequel as 
(5.36) 



z{t 



n+1) 



Z{tn) 



At 



S^,t,^ z{tn+l), 



z{tn+2) — Z{tn+l) 



At 



n Ti+l/4\ 



It can be proven that $" is a symplectic map. The proof is indeed exactly the same 
as for the symplecticity of the classical RATTLE scheme, see [22l Sections VII. 1.3] 
for an explicit computation for symplectic Euler and [22l Sections VII. 1.4] for an 
extension to RATTLE. As a consequence, $" transports the phase space measure 



to the phase space measure a. 



'=('n + 2)-^('n+l) 



5.4.2. Comments on the fluctuation- dissipation part (j5.32p - (j5.34p . In practice, (|5.32p 
may be rewritten in a form more suited to numerical computations. Of course, 
similar considerations hold for (|5.34p . Since 7p(g) = Pm Pm {q)'^ and (Jp{q) = 
PM{q)cr, (|5.32p is equivalent to: 



1+1/4 



At 



1+1/4 



2VeG^/(g 



z{tn+l) - z{tn) 

At 



(5.37) < 



■ag" + Ve(q")^A"+i/4, 

""l^ A//"^l7i"+l/4 — z{tn+l) — zjtn) 



At 



(Cp) 



where the Lagrange multiplier A"+^/* is associated with the constraint (Cp). The 
equivalence between (|5.32|) and (|5.37p can be checked by multiplying (|5.37p by 
PM(g") and using ([OSl) . 

The Lagrange multiplier A"^^/'* in (|5.37p is obtained by multiplying the above 

equation by V^((7")-^M^^ (id + ^7^1/^^) , and solving the following linear sys- 
tem: 



zjtn+l) - zjtn) 

At 

+ Ve(g")^M-i 
+ Ve(g")'^M"^ 



V^((?")^M-i ( Id 



Id + ^^M-' 



Id + ^7A/"^ 



At 



Id - —-/M- 
4 



P 



(^7A^-'ve(g")G^/(g") 

ve(g")A"+i/4. 



z{tn+l) ~ z{tn) /Af 

2 V 2 



This system is well posed. Indeed, the matrix V^(g")'^'A/"i (Id + ^'^M 



VC(g") 



can be rewritten as V ^{q)'^ S\7 ^{q) with S = A/^^ (id + ^ -fM-'^) \ Both M and 
7 are symmetric and non-negative, so that S is symmetric, positive and invcrtible. 
Finally, the invertibility of V^(g)-^S'V^(g) follows from the invertibility of GM{q)- 
In the special case when 7 and M are equal up to a multiplicative constant, 
the numerical integration can be simplified using the explicit formula p.20p and 
the method described below p.20p . which still holds for the tangential part of the 
momentum. See Section 15.61 below for further precisions. 
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5.4.3. Discretization of the backward process (|5.6|) . The splitting scheme for the 
backward Langevin dynamics with time-evolving constraints ()5.6|) reads as follows: 
Denote n' ~ Nt — n, take initial conditions {q^'^,p^''^) distributed according to 



(5.38) 



St 



pb,n' + l/4 ^ ph.n' 



and iterate on < n' < Nt — 1, 



pb,„' + l/2 ^ pb,„' + l/4 ^ ^vy(gbV) ^ y^(^b,n');^bV + l/2^ 



^b,n +1 



(5.39) < 



^(^'^■"'+1) = z(ijv._„._i), 
At. 



pb.„'+3/4 ^ pb,„' + l/2 ^ f^vy(^b,„' + l) ^ y^(^b,„' + l);^b,„'+3/4^ 



3,ri'+3/4 _ z(tjVj-n/) — z(tjv^_n/^i) 



(5.40) 



At 



At 

r 

b,i 



+ l ^^b,„'+3/4 _ fii^^(^b,„' + l)^^-l(pb,„'+3/4 ^pb,„' + l) 



iN^b.i 



-1/2 



where (C/*^'" ) and {G^'^^ +1/2 ) are sequences of i.i.d. Gaussian random variables of 
mean and covariance matrix IdsA?. The numerical flow associated with (j5.39p is 
denoted 



z{tn)^ 



zitn+l) - zitn) 



-> T.^^v^ z(t„-i) 



z(tn) - z{tn~l) 



At ) ' 

p^,.. rl/4^ ^ ^^b.n' + l^pb,n'+3/4-j 

where we recall n' = Nt — n. Assuming that the flow given by (|5.36p and 
are both well-defined, the following reversibility property is easily checked (extend- 
ing the symmetry property of the standard RATTLE scheme, see for instance [22| 
Section VII. 1.41): 



Id. 



(5.41) $b,ArT-n ^ ^ 

5.4.4. Work discretization and free energy computations. The work (|5.12p can be 
approximated using the Lagrange multipliers in (j5.33p : 



(5.42) 




_^ ;^«+3/4^ 



for n = . . . Nt — 1. The (formal) consistency of the work discretization (|5.42p in 
the time continuous limit is a direct consequence of the work expression (|5.12p . 

An estimator of the free energy profile is then obtained by using K indepen- 
dent realizations of the switching process, computing the work y\>^'^''^ for each 
realization k G {1, . . . , K} (with the numerical trajectories obtained from the nu- 
merical scheme (|5.32p - (|5.33p - (|5.34p and i.i.d. initial conditions sampled according 
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to A*v ( I ^ ^(ti)-2(to) ^ )i ^^"i approximating (|5.23p . rewritten up to an unimpor- 

i^e.i'j (^(*o), St ) 

tant additive constant (independent of T), as 

F{z{T)) ^ -ilnE (e-/5[^"^+c"- , 
with empirical averages such as 

In the above, the discretization C"((7) of the corrector (|1.18[) is 
(5.43) 

C"(.) ^ ^hr (detGM(.)) - \ (^^^^4^)'g-(,) (f(!-iI_iM) . 

We refer to Chapter 4 in [35] for more background on free energy estimators for 
nonequiUbrium dynamics. In particular, it is possible to compute a work associated 
with the backward switching from the Lagrange multipliers in ()5.6p . and to resort 
to bridge estimators (see Section 4.2.3 in [35l). 

However, using approximations such as (|5.42l) in the Jarzynski-Crooks identity 
introduces a time discretization error. We show in the next section how to eliminate 
this error. 

5.4.5. Discrete Jarzynski-Crooks identity. It turns out that a discrete version of 
the Jarzynski-Crooks identity (j5.19p can be obtained. This enables the estimation 
of free energy differences using nonequilibrium simulation without time discretiza- 
tion error. The discrete equality (|5.48p below may be seen as an extension of 
the corresponding equality obtained for transitions associated with time-dependent 
Hamiltonians and performed with Metropolis-Hastings dynamics (see [10] and Re- 
mark 4.5 in [35]). 

For this purpose, we consider a discretization of the work yVo,T using the in- 
terpretation (j5.16p of the work as the energy variation of the Hamiltonian part of 
the Langcvin dynamics. This leads to the following definition of the work at the 
discrete level: 



(5.44) 



>V° = 0, 



for n — . . . Nt — 1- This work discretization leads to a Jarzynski-Crooks identity 
without time discretization error. 

Theorem 5.5 (Discrete Jarzynski-Crooks fluctuation identity). Consider the dis- 
tribution /iSj ^^{z(t),i(t)) ci^d its normalization Zz[t),z{t) defined in (|2.15|1 . Denote by 
{q"' ,p"}o<n<NT solution of the forward discretized Langcvin dynamics (|5.32[) - 
(|5.33p - (|5.34p with initial conditions distributed according to 

(5.45) (g°,P°) „^(^(t,),£(£i)^_i(M)(^9dp), 

and by {g^'" }o<n'<NT solution of the discretized backward Langevin dy- 

namics (|5.38p - (|5.39p - (|5.40p distributed according to 

(5.46) (q'^'",/^°)^u , ,^{dqdp). 



LANGEVIN DYNAMICS WITH CONSTRAINTS 



51 



Then, the following J arzynski- Crooks identity holds on [0,Nt]: for any bounded 
discrete path functional (^[gjv^], 

where W" is computed according to (|5.44p . and ( • denotes the composition with 
the operation of time reversal of paths: 

(5.48) ¥'io,7V.]({'?'^'"',/'"'}o<n'<iV.) = ^lO.N^]{{q'''''^-'\p''^''^^''h<n<Nr)- 

The (formal) consistency of the work discretization (|5.44p in the time continuous 
limit is a direct consequence of the work expression (j5.16p . Free energy estimators 
based on the identity (|5.47p are obtained as described in Section 15.4.41 Let us 
emphasize once again that there is no error related to the finiteness of the time- 
step At in this estimator, and that the only source of approximation is due to the 
statistical error. 



Proof. With a slight abuse of notation, we denote in the same way the random 
variables (g",K), (g'''",P^'"), etc. in or 
and the integration variables in the definition of probability distributions. We 
divide the proof into three steps. 

Step 1: The phase space conservation of $" and and the reversibility prop- 

erty (|5.4ip imply 
(5.49) 

^*,.(,.,p.+i/4)(dg"+i dp"+3/4) (^'?" dp"+^/*) 

= ^<[,b,jvr-..-i(,„+i^p„+3/4)(dg"(ip"+^/'*)cr^^ (z(t„+i) ±zi+i>^phi±ll"j(dq"'^^ (ip"+^/^). 



Step 2: The probability distribution of given {q^,p^) in the discretization 

of the fluctuation-dissipation part ([O^ is denoted K*^^ {q" , p" , dp^+^ ' . The 
scheme (|5.32l) is a mid-point discretization of an Ornstein-Uhlenbeck process, which 
can be rewritten by decomposing the orthogonal and tangential updates of the 
momentum: 



(5.50) 



where p\\ = PM{q^)p, and p± = (Id — PmW''))p- The Markov chain induced by the 
parallel part of the momentum is the same as the one induced by the scheme p.l6p 
(or (|3.18p ) defined in Section [3^ The latter verifies a detailed balance equation 
(both in the plain sense and up to momentum reversal) with respect to the sta- 
tionary measure K^{^^^^{dp) defined by (|3.19p (see Sections 2.3.2 and 3.3.5 in [55]). 

We recall that this measure is defined as the kinetic probability distribution in the 
momentum variable of the canonical distribution tiT*'S{z)idq dp) on the tangential 
space, conditioned by a given q G S(z). Adding the (invariant) orthogonal part of 
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the momentum, the following detailed balance condition is satisfied: 
(5.51) 

exp (^^{p-fM-^pA K°^{q-,p\dp^^+^^^)a^'-' /^^^ii^-^Jdp") 

Step 3: Denote by isTf dp"+^) the probability distri- 

bution of the variables (g"+i^p"+i/4 pn+3/4 ^n+i-j gjycn the variables {q",p^^) in 
the scheme ([5:321) - ((5:33)) - (|534)) : and by K^{q^ ''' ,p^ ''' ■,dq^''''+^ ,dp^ '''+^^^, dp^'^^'+^Z^ , dp^-'''+^) 
the probability distribution of the variables (qb,n'+i^pb,n'+i/4^pb,«'+3/4^pb,n'+i) 

given the variables {q^-'^' ,p^-''^') in the scheme The splitting 

structure yields: 

K^'''{q'^,p";dq"+^ dp"+^/^ dp^'+^Z* dp^+^) 

= K^^{q-+\p-+^'\dp-+^)5^^^^.^^.,U^^{dq-+^dp-+^'^)^^^ 

as well as 

xA'°U(^b,n'^pb,«'^^pb,«' + l/4)^ 

Combining the detailed balance conditions (j5.49p and (|5.5ip of Steps 1 and 2, and 
using the decomposition p.24p of phase space measures, it follows 

= A->^'^^-"-i(g"+i,p"+i; dg" dp"+3/4 ^pn+i/4 

^5, "5 (2(tn + l), St ) 

which can be seen as the Jarzynski-Crooks identity over one time-step. Iterating 
the argument, it is easy to obtain: 

e-''^""A'f^O(g",/;dgi V/4dp3/4^pi) _ , K'^^^-\q^^-\p^^-^;dq^^ dp^-'^/^ dp^--i/4 ^piv, ) 

= A'b.A^T-l(^l^pl. rfgO ^p3/4 ^^0) _ K'^fi^qNT^pNr.dqNr-l dp^T-l/i dp^T-3/4 rf^iV.-l) 

xe , -(*jvt+i)-- ap j, 



which yields (|07|) . □ 



5.5. The overdamped limit. 
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5.5.1. An exact free energy estimator for the overdamped Langevin dynamics. The 
splitting scheme (|5.32p - (|5.33p - (|5.34p can be used in the overdamped regime, using 
the method of Proposition by choosing 



(5.52) 



-U, 



which implies = 2P'[jPm and ap = -^Pm- For this choice of parameters, the 
continuous limit of the numerical scheme is the following variant of the stochastic 
differential equation p.23|) : 



(5.53) 



dqt = -VV{qt) dt 
Mlt) = z{t), 



^dWt+\7aqt)dXf, 



where A""^ is an adapted stochastic process such that ^{qt) = z{t). We then ob- 
tain the following Jarzynski-Crooks relation for discretized overdamped dynamics, 
without time discretization error. 

Proposition 5.6. Suppose that the relation ()5.52p is satisfied. With a slight abuse 
of notation, the mass matrix and the friction matrix are rewritten as M Id and 7 Id 
with A/, 7 e M. Then the splitting scheme (|5.32p - (j5.33p - (|5.34p yields the following 
Euler discretization of the overdamped Langevin constrained dynamics (|5.53p ; 



/2At 



(5.54) 



q"+l = - AtVl-(q") + ^ — 5" + Ve(<z") Kt\ 
^e(g"+i) = ^(Wl), 



where {Q^)n>o cire independent and identically distributed centered and normalized 
Gaussian variables, and (A"jj)„>i are the Lagrange multipliers associated with the 
constraints (■^(9") = z{tn))o<n<NT- same way, the backward process (j5.38p - 

()5.39p - ()5.40p yields the following Euler scheme: 



(5.55) 



Consider the work update 
'W° = 0, 

. Nt — 1 , where 




1 

At 



n+3/4 



n+1/4 



1 2 At 



^ ^ P(g")g" + Ve(g")G-i (g") (z(t„+i) - zitr.)) , 
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with G = V^-^V^, and the scheme ()5.54p is rewritten as: 

At 



(5.57) 



vy(g"+i) + ve(g"+')A"+3/^ 



Ve(g"+i)V+'/' - ^ (Cp) 



T/ien the J arzynski- Crooks relation (j5.47p /lo/ds under the assumptions (j5.45p and (j5.46p 
071 the initial conditions of the schemes (|5.54p and (|5.55p respectively. 



The proof is a direct consequence of the reformulation of (|5.54p into (|5.57p , and 
a direct apphcation of Theorem 15 . 51 with the parameters (|5.52p . 
Note that the free energy estimator 



(5.58) 



F{z{T)) = -ihrE {^^-p[y^"-+c^n<i"n] 



based on the work (|5.56p and the corrector 
(5.59) 



C 



"(,) = ^ln(detG(,)) - ^ (^iiWil^y ^.(Wr) - .(*„) 



At 



is exact (there is no time discretization error). This free energy estimator can be 
seen as a variant of the estimator proposed in [33] , which was derived directly for 
the scheme (|5.54p . and reads (up to an unimportant additive constant): 



(5.60) 

where the work is defined as 

'w" = 0, 



F{z{T)) ^ -- InE (c"^[w~-+c(g~-)]^ 



(5.61) 



z{tn+l) - z{tn) \ r„+i 



with 



2At 



In+i ^ 2A-+1/2 = A„"+' - G-i(g")(z(t„+i) - z(t„)) + G-i(g")Ve(<?")^g", 



and the modified corrector is defined without the kinetic energy term: 
(5.62) c(g)==^ln(detG(g) 



There is a bias due to the time discretization error in the estimator (|5.60p . which 
can be removed upon following the procedure described in Proposition l5.6l 
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5.5.2. Consistency analysis of three free energy estimators. In this section, we 
would like to discuss the consistency of three free energy estimators introduced 
above: (|5.60p - (j5.6ip (based on the direct discretization of the overdampcd dynamics 
proposed in [S^), (|5.58p - (|5.42p (which uses the Lagrange multipliers to approximate 
the work) and (|5.58p - (|5.56p (based on the discrete Jarzynski equality). 
The limiting continuous-in-time version of the Jarzynski relation is: 

(5.63) Fiz{T)) = InE (c-''^«'^(^«*>«<'<^)) , 
where the work for the overdampcd dynamics (|5.53p reads (see |34] ) : 

(5.64) WS% {{qt}o<t<T) = 1^ Atf d\f, 
with 

dXf = dXf - G-\qt)z\t) dt + ^G-\qt)Vaqtf dWt. 

The consistency of (j5.60p - (j5.6ip with (|5.63|) - (|5.64p was already proven in |34j . 

Concerning the consistency of C" with C (see (|5.58p and (|5.60p ). note that in 
the overdampcd scaling {M = -^Id), the difference 

Ciq) - C"(,) = ^ ^^(Un^lMy G-\q) ^^(hll^lM^ . 0(At) 

vanishes when At — >■ 0. This difference can therefore be neglected when analyz- 
ing the consistency of the scheme in the continuous-in-time limit. We henceforth 
concentrate on the consistency of the works ()5.42|) and (|5.56|) with ()5.64|) . 

In the sequel, we denote the anticipating stochastic integration of the integrand 
Yt with respect to dXt by 

Yt-dXt = 2Yt odXt-Yt.dXt, 

where o is the Stratonovitch symmetric integration, and . the Ito integration. The 
symbol ~^ denotes the formal time continuous limit. 

Consistency of (|5.42p . Let us justify the consistency of the work expression (|5.42p 

with (|5.64p . Remark that the Lagrange multipliers in (|5.57p verify: 

(5.65) 

2A"+i/2 = G-i(g") [vC(g")^ - g") - (z(t„+i)-z(i„)) + AtVC(<?")^VF(g")" 
and 



(5.66) 2A"+i/2 = - G-^(<z")(z(t„+i) - z(t„)) + ^^G-i(g")VC(<z") V\ 

as well as 
(5.67) 

2A"+3/4 = G-i(g"+i)[ve(g"+i)^ (g" - + (z(t„+2) - ^(t„+i)) + AtV^(g"+i)^Vl/(g"+i)' 

The expressions (|5.65p and (|5.67p yield 

(5.68) 2A"+l/2 ^ (Ve(<?t)^ .dqt - z'{t) dt + ^i{qtf^V{qt) dt) , 

as well as 

2A"+3/4 ^ G~\qt) {-^aitfdqt + z'{t) dt + Ve(gt)^Vy((?t) dt) . 
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Moreover the constraints imply that 

(5.69) d^qt) = z'{t) dt = Veiqt) odqt = ^ (V^^(<?t) .dqt + VeiqtYdqt) , 

SO that A"+^/^ and A"^'^/'* yield the same time continuous limit, that is to say 

(5.70) 2A"+3/4 ^ G-\qt) {Vaqtf -dqt - z'{t) dt + V^{qtfVV{qt) dt) . 
Eventually, (|5.66p implies 

(5.71) 2A"+i/2 ^ ^3^od^ 

the same holding true for 2A"+3/^. The work expression (|5.42p is thus formally 
consistent with (|5.64p . This concludes the proof of the consistency of (|5.58p - (|5.42p 

with dusii-dn;!!]). 

Consistency of (|5.56p . We now prove the consistency of the work expression (|5.56p 
with dnjH). Define 



.r = -^vy(g")+ve(g")A"+i/2, 



fn+i ^ _^vi/(g"+i) + V^((7"+i)A"+3/4. 



The expression (|5.56p yields using ([5.57^ : 



n+1 



w 

(5.72) 
(5.73) 
where 



-W" = V{q"+')-Viq") + — 



1+1/2 _|_ jn+1 



1+1/2 _ 



f 



= Viq"+') - y(9") + — (/" + r+') ■ - - /" + /"+i) 



At 
1 



: l/(g"+^) - y(g") - -(Vl/(g") + Vy(g"+^)) • [q 



,,n+l „n\ 



/» ^ i_ (ve(g")A"+i/2 + ve(9"+^)A"+3/4^ . I^qn+l _ ^n-) ^ 

First, since y(g"+i) - y(g") Vy(gt) odqt and i(Vy(q") + Vy(g"+i)) • ((7"+i - 
g") ^ _vy(^gj) o the limit of the terms in (|5.72p is zero. Second, using the 
expressions (|5.65p and (|5.67p . and similarly to (|5.68p and (|5.70p . it holds 
(5.74) 



/"-/"+i = (veG-i)(g") v^q'Y {q 



n\T i „n+l n\ 



(z(t„+i) - z(t„)) + AtV^(g")^Vy(g") 



- (VCG-i)(g"+i)[ve(9"+^)^ (g" - + (z(t„+2) - ^(Wi)) + AtV^(g"+i)^Vy(g"+i) 



+ ^(Vn'Z"+')-Vn'Z"))=o(At) 



since (V^G"i V^(g")+VCG-iV$(g"+i))^ 



2VCG-i(9t)-z'W by (lOgi) . 



Expanding in higher order powers of At, it can be checked that there exists two 
functions a and h such that 



/" - / 



Ji+1 



At 



a(t, qt) dt + b(t, qt).dqt. 
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Therefore, since (in the Hmit At — > 0) tlic martingale part of /" + /"^^ arises only 
from the term V ^G~^W ^'^ (qt) -dqt , one obtains 

pri+l 



\dl^j^ V^G-^Ve{qt)P{qt).dWt, b{t,qt)P{qt).dWt^ =0 



2 

since {qt)P{qt) = 0. In conclusion, the second term in (|5.73p has a zero contri- 
bution to the continuous-in-time limit. 

As a consequence, the formal time continuous limit of >V"+^ — W" is the same 
as the one of Computations similar to the one performed above yield 

J" = 2^(9"^' - '7")^(V?(g"+') - Ve((7"))(A"+i/2 - A"+3/4) = o(Ai). 

Indeed, X^+^Z^ ~ ^ Q^^i) as in (|5J4)) . while - g")^(V^(g"+i) - 

^■^(9")) = 0(At). The formal time continuous limit of /" is therefore the same as 
the limit of 

/" + J" = 2^(9"+' - I'f (Ve(9") + Ve(g"+')) (a"+i/2 + A"+3/4^ ^ 
Since (|5.69p implies 



T 

-g") z'(t) dt, 



2 

we get in the end that the formal time continuous limit of /" and W'+i — W" is 
the same as: 

where we have used (j5.7ip . This concludes the proof of the consistency of (|5.58p - 
((536)) with ((533)) - ((534)) . 

5.6. Numerical illustration. We present some free energy profiles obtained with 
nonequilibrium switching dynamics for the model system and the parameters de- 
scribed in Section The switching schedule reads 

■^(0 — -2-min (^max ^min)^ 

with Zmin = —0.1 and Zmax = l-l- The time-step is At = 0.01. The initial 
conditions are obtained by first subsampling a constrained dynamics with ^(q) = 
Zmin and v^{q,p) = 0, with a time spacing Tgampic = 1; and then adding the required 
component V^{q)G~^ z{0) to the momentum variable (with i(0) = (^max— •Zmin)/?')- 
In the specific case at hand, the corrector term (jl.lSp is constant, and free en- 
ergies differences are equal to differences of rigid free energies. The dynamics used 
to integrate the nonequilibrium dynamics is based on a splitting strategy, analo- 
gous to (|5.32p - (|5.33p - ()5.34p . except that the midpoint integration of the Ornstein- 
Uhlcnbeck part is replaced by an exact integration for the unconstrained dynamics, 
followed by a projection. This can be done here since we choose a friction matrix of 
the form 7 Id (recall also that M = Id) . More precisely, the corresponding scheme 
is obtained by replacing (|5.32p (and likewise for (|5.34p ') with 
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0.0 0.2 0.4 0.6 0.8 1.0 

Reaction coordinate 



Figure 3. Free energy profiles. The top curve corresponds to 
T = 1 with M — 10^, while the two other curves were obtained for 
T = 10 with M = 10-* and T = 100 with M = 10^ (smoothest 
curve). 

where a = e"^'^*, and setting p«+i/4 ^ ^"+1/4 ^ a"+i/'^V^((7") with A"+i/4 chosen 
such that 

Figure |3] presents estimates obtained with M independent realizations of the 
switching dynamics for different switching times T, using the estimator presented 
in Section [5.4.4l with the work discretization ()5.42p . In all cases, the product MT is 
kept constant. The free energy profile becomes closer to the reference curve as T is 
increased, and the profile obtained for T = 100 is in excellent agreement with the 
result obtained with thermodynamic integration. When the switching time is small, 
more realizations should be considered to reduce the statistical errors and obtain 
estimates in better agreement with the reference profile. The fact that the variance 
is very large when the switching time T is too small is a well-known drawback of 
estimators based on the Jarzynski- Crooks identity, see the review in Sections 4.1.4 
and 4.1.5 in [3S]. Roughly speaking, the difficulty is related to the fact that the 
free energy difference is obtained as an average of exp(— /3W), which requires a very 
good sampling of the small values of the work W. As T decreases, the width of 
the work distribution increases and the low tail part is more and more difficult to 
sample. Improved estimates can be obtained with estimators based on combinations 
of forward and backward trajectories, see for instance [10] and Section 4.2 in [55] . 

6. Appendix: Some technical results 

We give in this appendix two technical lemmas, used in the proof of Proposi- 
tion |43] The first lemma can also be used to prove the divergence formula ()2.25p . 

Lemma 6.1. For any a G {1, . . . , 2m} : 

2m 

(6.1) ^{|detr|i/^(r-i),,b,Sb} = o, 

b=l 

where T is defined in (j2.9p . 
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Proof. The proof relies on the following computation rules for any family of invert- 
ible square matrices 6 t-^ Ae: 

(6.2) i^fin|detA,|)=tr('A-i-|A, 



d9 V ' "'J V » de 

and 

Fix a £ {1, . . . , 2m}. First, using (|6.3p with Ag replaced by F and ^ replaced by 
{•, Sc}, we obtain 

2m 2m 

ra,6{(F ^)h_c,Sc} = - ^ {Fa,b,Sc} (F~^)f,_c, 
b,c=l &,c=l 

SO that by the skew-symmetry of F~^ and F, 

J2 ra,b{{r-')b,c,^c} = J2 -2(ira,fc,S,} + {Fe,a,Sfa})(F-l)b,e. 

b,c=l b,c=l 

Jacobi's identity for Poisson brackets and (j6.2p then yield 

2m 2m 



E r,,,{(F-iK„s,} = i {{s,,sj,s4(F-i), 

fc,c=l 
2m 

t,c=l 

i {In |detF| ,SJ = - IdctFr^/' {|detF|l/^S,} 

2m 

E |detFri/2F,,b(F-i)b,,{|dctF|i/2,S,} 



2 

b,c=l fc,c=l 

2 m 

2 

t,c=l 



fc,c=l 



since Fa.t(F )f,.c = ba.c where is the Kronecker symbol. Finally, the left hand 
and right hand sides of the last equality can be factorized as 

2m 

|detF|-i/'F,,fc{|dctF|i/'(F-i)fc,„S,} =0. 

t,c=l 

Since |detF| > and F is invertiblc, it follows 

2m 

E{|detF|i/^ (F-i),,„S,}=0 

c=l 

for all 6 = 1, ... , 2m, which is dUI]). □ 
Lemma 6.2. For any compactly supported smooth test function ip on R^^; 

where the phase space Se:(C) defined in (|2.7p . anrf i/ie Gram matrix F m (|2.9p . 
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Proof. Consider a test function (j> : E^™ — >■ K. An integration by parts and the 
co-area formula p.l9p give: 

0(C)Vc / ip{q,p)(7^^(^Q{dqdp)] dC 

y^Bic) J 

Vc'?^(C) / ^iq,P)<^T,^{o{dqdp)] dC 

= -/ {E,(l)oE}ipdet{ry^^ dqdp, 

where in the last line the following chain rule has been used: 

{S, o S} iq,p) = {S, S} (g,p)Vc0(S(g,p)) = r(g,p)Vc0(S(g,p)). 
Now an integration by parts with respect to dq dp, together with (j6.1l) , leads to 

2m 



/ = V/ 0os|s6,det(r)i/2r-V} rfgrfp 
.osr-i{s,^} dct(r)i/2 dgrfp 




{S,(^} dcrEH(C)J ^C, 

which gives the result. □ 



[lo; 

[11 
[12; 

fi3- 
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